Some Fundamental Control Theory I: 
Controllability, Observability, and Duality 


William J. Terrell 


1. INTRODUCTION. It is well known that a single n-th order nonhomogeneous 
linear differential equation is equivalent to a system of n first order linear 
differential equations. Specifically, an n-th order linear equation 

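\[ y^{(n)} + k_1 y^{(n-1)} + \cdots + k_{n-1} y' + k_n y = u(t), \qquad (1) \]

with real constant coefficients $k_i$, is equivalent, via the standard definition of the 
vector variable $z = [y \;\; y' \;\; y'' \;\; \cdots \;\; y^{(n-1)}]^T$, to the linear system 

\[ z' = Pz + d\,u(t), \qquad (2) \]
where 
\[ P = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -k_n & -k_{n-1} & -k_{n-2} & \cdots & -k_1 \end{bmatrix} \qquad (3) \]

is a companion matrix, the $k_i$ are as in (1), and 
\[ d = [0 \;\; 0 \;\; \cdots \;\; 1]^T \qquad (4) \]

is the n-th standard basis vector. 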
What about the converse? When can a constant coefficient linear system 


x’ = Ax + bu(t), (5) 


where $A$ is $n \times n$ and $b$ is $n \times 1$, be transformed to (2) by a nonsingular linear 
transformation of the state variable, $z = Tx$, where $T$ is a constant matrix? Since 
$z' = Tx' = (TAT^{-1})(Tx) + Tbu = (TAT^{-1})z + (Tb)u$, we are led to ask: When is 
there a nonsingular $T$ such that $TAT^{-1}$ is a companion matrix and $Tb$ is the n-th 
standard basis vector? 

The answer to this question is known [8, Chapter 2], although it seems not to be 
common knowledge outside the mathematical control community. A linear trans- 
formation of the state x that transforms (5) to (2) is not always possible, as can be 
seen by considering the diagonal system 


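\[ x' = \begin{bmatrix} \lambda & 0 \\ 0 & \lambda \end{bmatrix} x + \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} u(t). \qquad (6) \]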


In advanced courses in dynamics the subject of normal forms achieved by coordi- 
nate transformations is an important topic. And in the literature of mathematical 
control theory, questions concerning alternative system representations have al- 
ways been important. However, we know of no elementary differential equations 
text outside the control-theoretic literature that systematically addresses the ques- 
tion of a transformation from (5) to (2). 
The primary purpose of this article is to introduce a circle of ideas in mathematical 
control theory. The approach is via the question of “equivalence” between 
n-dimensional first order linear systems like (5) and n-th order linear equations 
like (1). The full answer to the equivalence question introduces some of the central 
concepts of modern control theory. We derive some classical results concerning the 
important control-theoretic concepts of controllability and observability. We also 
consider the relationships of these concepts with other important topics in control, 
such as stabilization of equilibria, and linearization of nonlinear systems using 
coordinate change and state feedback. 


• In Sections 2 and 3 we clarify the relationship between the system (5) and the 
equation (1) and derive a necessary and sufficient condition for equivalence. 

• In Section 4 we explore the equivalence condition of Section 3 by motivating 
and explaining its meaning as a controllability condition. We then rephrase 
our original equivalence problem and introduce the concept of observability. 
An easy step in Section 5 then shows the algebraic duality of controllability 
and observability. 

• In Section 6 we indicate briefly the importance of these developments to 
questions of asymptotic behavior such as stability. 

• Finally, Section 7 briefly discusses some extensions of Sections 4-6 to the case 
of linear systems with multivariable input and multivariable output. 


2. A SIMPLE EXAMPLE. Let us begin with a naive approach to transforming a 
simple example and then consider a precise definition of linear equivalence 
of systems. 


Example 1. Consider the system 
\[ x_1' = -2x_1 + 2x_2 + u(t), \qquad (7a) \]
\[ x_2' = x_1 - x_2, \qquad (7b) \]
which has the form (5) with 

\[ A = \begin{bmatrix} -2 & 2 \\ 1 & -1 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]
Differentiate (7b) and substitute from (7a) to obtain $x_2'' + 3x_2' = u(t)$. This second 
order equation for $x_2$ has the form (1) for $n = 2$, and a solution of it for $x_2(t)$ 
determines a function $x_1(t)$ (using $x_1 = x_2' + x_2$ from (7b)) so that system (7) is 
solved. Thus, system (7) can reasonably be said to be equivalent to the second 
order equation $y'' + 3y' = u(t)$. 


Is there another second order equation of the form $y'' + k_1 y' + k_2 y = u(t)$ that 
is also equivalent to (7)? For example, we might try to get a second order equation 
for $x_1$. This question is handled using a precise definition of equivalence. Note that 
the equation $y'' + 3y' = u(t)$ has the linear system form 

\[ z' = \begin{bmatrix} 0 & 1 \\ 0 & -3 \end{bmatrix} z + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u. \qquad (8) \]

We expect that there is a transformation from $\mathbb{R}^2$ to itself that transforms our 
original system (7) to the form (8). Since the differential equations are linear, we 
expect that the transformation is linear, say $z = Tx$. Differentiation then gives 
$z' = TAT^{-1}z + Tb\,u(t)$. 
Definition 1. The system $x' = Ax + bu(t)$ is linearly equivalent to the system 
$z' = Ez + fu(t)$ if there exists a nonsingular matrix $T$ such that 

\[ TAT^{-1} = E, \qquad Tb = f. \qquad (9) \]
Thus, (5) is equivalent to (2) if and only if there is a nonsingular $T$ such that 
$TAT^{-1} = P$ and $Tb = d$, where $P$ and $d$ are given in (3) and (4). 
In Example 1 we obtained a second order equation for the variable $x_2$. If we set 
$z_1 = x_2$, $z_2 = x_2'$, then $z_2 = x_2' = x_1 - x_2$, so a transformation demonstrating the 
equivalence of (7) and (8) is given by 

\[ T = \begin{bmatrix} 0 & 1 \\ 1 & -1 \end{bmatrix}. \]
The natural questions concerning existence, uniqueness, and computation of $T$ 
arise. Before proceeding to answer these questions it is instructive to try to 
transform the following example to the form (2). 
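The equivalence claimed for Example 1 can be checked mechanically. The sketch below assumes the coefficients A = [[-2, 2], [1, -1]] and b = [1, 0]^T (the values consistent with x_2' = x_1 - x_2 and x_2'' + 3x_2' = u) and tests the two conditions of Definition 1 in the inverse-free form TA = PT, Tb = d. The helper name `matmul` is illustrative only.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

# Example 1, read off from x1' = -2*x1 + 2*x2 + u and x2' = x1 - x2.
A = [[-2, 2], [1, -1]]
b = [[1], [0]]
# Companion data for y'' + 3y' = u.
P = [[0, 1], [0, -3]]
d = [[0], [1]]
# Candidate transformation: z1 = x2, z2 = x1 - x2.
T = [[0, 1], [1, -1]]

det_T = T[0][0] * T[1][1] - T[0][1] * T[1][0]
# For nonsingular T, the condition T A T^{-1} = P is the same as T A = P T.
is_similar = matmul(T, A) == matmul(P, T)
maps_b_to_d = matmul(T, b) == d
```

Working with TA = PT avoids computing an inverse, so the check stays in exact integer arithmetic.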


Example 2. Let $A$ be as in Example 1, but let $b = [1 \;\; 1]^T$. Show that this system 
cannot be transformed to the form (2). Hint: In Example 1 we knew P once we 
had the second order equation, but, in fact, we know P anyway because by 
similarity we know P’s characteristic polynomial. 


3. EQUIVALENCE AND THE COMPANION MATRIX P. System (2) is very 
special; we call it a companion system because $P$ is a companion matrix defined by 
the characteristic polynomial $\lambda^n + k_1\lambda^{n-1} + k_2\lambda^{n-2} + \cdots + k_{n-1}\lambda + k_n$, which 
is the same as the characteristic polynomial of any matrix that is similar to $P$. 
Example 1 shows that system (5) may be equivalent to a companion system, and we 
have seen two examples of (5) that are not equivalent to a companion system, 
namely, Example 2 and a two-dimensional system with diagonal $A$ having a 
repeated eigenvalue. 


3.1 A Similarity Invariant. It is convenient to make the following definition. 


Definition 2. The vector $x$ is a cyclic vector for the $n \times n$ matrix $A$ if the $n$ 
vectors $x, Ax, \ldots, A^{n-1}x$ are linearly independent. 


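In (2), $d$ is a cyclic vector for $P$; one way to see this is by direct calculation of 

\[ [d \;\; Pd \;\; P^2d \;\; \cdots \;\; P^{n-1}d] = \begin{bmatrix} 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & 1 & * \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 1 & \cdots & * & * \\ 1 & * & \cdots & * & * \end{bmatrix}, \]

which is nonsingular. Existence of a cyclic vector for a matrix is a similarity 
invariant. If $A$ is similar to $P$ and $TAT^{-1} = P$, then $A$ has a cyclic vector given by 
$T^{-1}d$, since $A^k T^{-1}d = T^{-1}P^k d$ for every $k \ge 0$. 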

The next proposition gives a useful condition that is equivalent to similarity 
between A and P. 


Proposition 1. [6, Theorem 3.3.15] The matrix $A$ is similar to the companion matrix 
$P$ of its characteristic polynomial if and only if the minimal and characteristic 
polynomials of $A$ are identical. 
Proof: Similar matrices have the same characteristic polynomial and minimal 
polynomial, and the minimal polynomial of a companion matrix P is the same as 
its characteristic polynomial [6, pp. 146-147]. Thus, if A is similar to P, then the 
minimal polynomial and the characteristic polynomial of A must be identical. 

On the other hand, if the minimal polynomial and characteristic polynomial of 
A are identical, then the Jordan canonical form of A must contain exactly one 
Jordan block for each distinct eigenvalue; the size of each Jordan block is equal to 
the multiplicity of the corresponding eigenvalue as a zero of the characteristic 
(minimal) polynomial of A. In this case, the Jordan canonical form of the 
companion matrix P has the same Jordan block structure as A, and hence it must 
be similar to A. Thus, A must be similar to P. ∎ 


Proposition 1 makes it easy to construct examples of matrices that have (or do 
not have) cyclic vectors. The similarity condition holds in Examples 1 and 2, where 
the characteristic (and minimal) polynomial for A is $\lambda(\lambda + 3)$. We conclude that 
there is some other obstruction in Example 2 to an equivalence with system (2), 
and the obstruction must involve the b vector. Thus, the problem with transform- 
ing Example 2 is related to the way the forcing function u enters the equations. 
We pursue this observation in Section 4. 


3.2 Uniqueness of the Transformation T. Assume that we have a nonsingular $T$ 
such that $TAT^{-1} = P$ and $Tb = d$. Then $TAT^{-1}d = TAb$, and $TA^kb = TA^kT^{-1}d = 
(TAT^{-1})^kd = P^kd$ for all $k \ge 0$. Nonsingularity of $T$ implies that 

\[ n = \operatorname{rank}[d \;\; Pd \;\; P^2d \;\; \cdots \;\; P^{n-1}d] = \operatorname{rank}[b \;\; Ab \;\; A^2b \;\; \cdots \;\; A^{n-1}b]. \]

Moreover, $T$ is uniquely determined by its action on the basis defined by the 
vectors $b, Ab, \ldots, A^{n-1}b$. Thus, we have the following uniqueness result and 
necessary condition. 


Proposition 2. There is at most one nonsingular linear transformation, $z = Tx$, taking 
(5) to the companion form (2). Such a $T$ exists only if 

\[ \operatorname{rank}[b \;\; Ab \;\; \cdots \;\; A^{n-1}b] = n. \qquad (10) \]

Example 2 is explained by this result, because in that example we have 

\[ [b \;\; Ab] = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix}. \]

We return to Example 2 later for additional insight. Proposition 2 also explains 
why we cannot get a second order equation (1) for the variable $x_1$ in Example 1: 
the dependent variable $y = z_1$ of the second order equation must be a uniquely 
determined linear combination of the components of $x$, and in Example 1 that 
combination is $x_2$. 
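The rank test (10) is easy to run by hand for the 2 × 2 examples, and a short exact computation makes the contrast explicit. This is a minimal sketch, assuming the Example 1 matrix A = [[-2, 2], [1, -1]] with b = [1, 0]^T (Example 1) and b = [1, 1]^T (Example 2); the function names are hypothetical helpers, and the rank is computed by Gaussian elimination over the rationals to avoid floating-point tolerances.

```python
from fractions import Fraction

def matvec(A, v):
    """Matrix times column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def controllability_matrix(A, b):
    """Matrix with columns b, Ab, ..., A^{n-1} b, returned as a list of rows."""
    cols, v = [], list(b)
    for _ in range(len(A)):
        cols.append(v)
        v = matvec(A, v)
    return [list(row) for row in zip(*cols)]

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

A = [[-2, 2], [1, -1]]
C1 = controllability_matrix(A, [1, 0])   # Example 1
C2 = controllability_matrix(A, [1, 1])   # Example 2
```

For Example 1 the matrix has full rank, while for Example 2 the second column Ab vanishes because b is an eigenvector for the eigenvalue 0.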

We now show that the criterion (10) is sufficient for there to be a nonsingular 
linear transformation T from (5) to (2). We also show how to construct T by a 
simple direct method. 


3.3 The Rank Condition (10) is Sufficient for Equivalence. Referring back to 
Example 1, the key in transforming (5) to (2) is to identify the variable $z_1$ that 
satisfies an equivalent n-th order equation (1). Note that $z_1 = (\text{first row of } T)\,x$. 
Let $\tau$ denote the first row of $T$. Since $Tb = d = [0 \;\; 0 \;\; \cdots \;\; 0 \;\; 1]^T$ and $TA^kb = P^kd$, we 
must have $\tau b = 0$, $\tau Ab = 0$, ..., $\tau A^{n-2}b = 0$, and $\tau A^{n-1}b = 1$. Write this as 

\[ \tau\,[b \;\; Ab \;\; \cdots \;\; A^{n-1}b] = [0 \;\; \cdots \;\; 0 \;\; 1] = d^T. \qquad (11) \]
Now if we assume $\operatorname{rank}[b \;\; Ab \;\; \cdots \;\; A^{n-1}b] = n$, there is a unique solution for $\tau$ in 
(11). (Again, the crucial $z_1$ variable must be a unique linear combination of the $x$ 
components.) What about the rest of $T$? The form of the companion system 
requires that 

\[ z_2 = (\text{second row of } T)\,x = z_1' = \tau x' = \tau(Ax + bu) = \tau Ax + \tau bu = \tau Ax; \]
therefore the second row of $T$ is $\tau A$. Continuing in this way, the equations 
defining $\tau$ and the form of the $z$ system imply that 

\[ T = \begin{bmatrix} \tau \\ \tau A \\ \tau A^2 \\ \vdots \\ \tau A^{n-1} \end{bmatrix}. \qquad (12) \]

We combine this argument with Proposition 2 as follows. 


Theorem 1. The system $x' = Ax + bu(t)$ with $x \in \mathbb{R}^n$ can be transformed to the 
companion system, $z' = Pz + du(t)$, by a nonsingular linear transformation, $z = Tx$, 
if and only if $\operatorname{rank}[b \;\; Ab \;\; \cdots \;\; A^{n-1}b] = n$; in this case, $T$ is uniquely defined by (12), 
where $\tau$ is the unique solution of (11). 
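The construction in Theorem 1 translates directly into a procedure: solve (11) for the row vector τ, then stack τ, τA, ..., τA^{n-1} as in (12). The following is a sketch under the assumption that the rank condition holds (the solver fails with StopIteration otherwise); `solve`, `matvec`, `vecmat`, and `companion_transformation` are illustrative names, and exact rational arithmetic is used so the result can be compared with the hand computation in Example 1.

```python
from fractions import Fraction

def matvec(A, v):
    """Matrix times column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def vecmat(v, A):
    """Row vector times matrix."""
    return [sum(v[i] * A[i][j] for i in range(len(A))) for j in range(len(A[0]))]

def solve(M, rhs):
    """Solve M x = rhs exactly by Gauss-Jordan elimination (M nonsingular)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next(i for i in range(col, n) if aug[i][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for i in range(n):
            if i != col:
                aug[i] = [x - aug[i][col] * y for x, y in zip(aug[i], aug[col])]
    return [row[n] for row in aug]

def companion_transformation(A, b):
    """Rows tau, tau A, ..., tau A^{n-1}, where tau [b Ab ... A^{n-1}b] = d^T."""
    n = len(A)
    cols, v = [], list(b)
    for _ in range(n):
        cols.append(v)
        v = matvec(A, v)
    d = [0] * (n - 1) + [1]
    tau = solve(cols, d)   # cols are the rows of C^T, so this solves C^T tau^T = d
    T, row = [], tau
    for _ in range(n):
        T.append(row)
        row = vecmat(row, A)
    return T

T = companion_transformation([[-2, 2], [1, -1]], [1, 0])
```

For the Example 1 data this reproduces τ = [0 1], that is, z_1 = x_2, in agreement with the transformation found by inspection.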


Theorem 1 answers our original question. If the basic algebraic fact concerning 
the existence of a cyclic vector for the companion matrix of A is understood, then 
the situation regarding equivalence between (5) and (2) becomes transparent. 

Our original question got us to this point. But there is much more involved here, 
if we re-examine things. Think about varying the nonhomogeneous term in (5). 
What if we apply different input functions u(t)? To what extent can this affect the 
solutions of the system? 

We consider the question of varying the input u(t) in the next section. By doing 
so, we obtain an analytic, control-theoretic meaning of the rank condition in 
Theorem 1. 


4. CONTROLLABILITY. System (5) is often called a single-input system because 
the input function $u$ is scalar-valued rather than vector-valued. We show in this 
section that a natural concept of controllability for the single-input system (5) 
coincides with $b$ being a cyclic vector for $A$. 

In an elementary differential equations course the nonhomogeneous term in (1) 
is considered to be a fixed, specified function of $t$. But we now ask: What happens 
with the system dynamics as we change $u$? More specifically, to what extent can 
the motion of the state vector $x(t)$ be influenced, starting from an initial state $x_0$ 
and using fairly arbitrary inputs $u(t)$? The next definition describes a concept of 
complete controllability of the state. Before stating this definition, we should 
specify a set $\mathcal{U}$ of admissible input functions. The solutions of linear constant 
coefficient systems of differential equations are defined on the entire real line, and 
generally we want the same property for the inputs. However, for some questions, 
the inputs are restricted to an interval $[t_0, t_1]$. Thus, with an appropriate restriction 
of domain when necessary, we could consider several vector spaces of functions for 
the set $\mathcal{U}$, including piecewise constant, continuous, or locally integrable inputs. 
A real-valued function $u(t)$ is locally integrable if 

\[ \int_{t_0}^{t_1} |u(s)|\,ds < \infty \]
for each $t_0 < t_1$. The set of locally integrable functions is the largest vector space 
of inputs for which (13) makes sense; therefore we assume our inputs are locally 
integrable. 


Definition 3. The linear system (5) is completely controllable if, given any $x_0$, 
$x_f \in \mathbb{R}^n$, there exists a $t_f > 0$ and a control function $u(t)$, defined for $0 \le t \le t_f$, 
such that the solution to (5) with initial condition $x(0) = x_0$ satisfies $x(t_f) = x_f$. 


The solution of (5) with $x(0) = x_0$ is 

\[ x(t) = e^{tA}\left( x_0 + \int_0^t e^{-sA} b\,u(s)\,ds \right), \qquad (13) \]
where 
\[ e^{tA} = I + tA + \frac{t^2}{2!}A^2 + \cdots + \frac{t^k}{k!}A^k + \cdots. \]

By the Weierstrass M-test, this series is absolutely and uniformly convergent for 
$|t| \le t_1$ for any finite $t_1$ [10, pp. 134-135]. The linear system (5) is completely 
controllable if for any given $x_0$, $x_f$ there is some $t_f$ and some locally integrable 
function $u$ on $0 \le t \le t_f$ such that 

\[ x_f = e^{t_fA}\left( x_0 + \int_0^{t_f} e^{-sA} b\,u(s)\,ds \right). \qquad (14) \]

It may be surprising that the solvability of (14) for arbitrary $x_0$, $x_f$ is determined by 
a purely algebraic criterion; the explanation lies with the Cayley-Hamilton Theorem: 
the matrix $A$ satisfies $p(A) = 0$, where $p(\lambda)$ is the characteristic polynomial 
of $A$. The rank condition (10) is known as the controllability rank condition, and the 
matrix $[b \;\; Ab \;\; \cdots \;\; A^{n-1}b]$ is called the controllability matrix, because of the next 
theorem. 


Theorem 2. The linear system $x' = Ax + bu(t)$ in (5) is completely controllable if and 
only if $\operatorname{rank}[b \;\; Ab \;\; \cdots \;\; A^{n-1}b] = n$. 


Proof: By the Cayley-Hamilton theorem, for each $k \ge n$, $A^k$ can be expressed as a 
linear combination of $I, A, A^2, \ldots, A^{n-1}$. Let $\mathcal{R}$ denote the column 
space (range) of $[b \;\; Ab \;\; \cdots \;\; A^{n-1}b]$. From the definition of the matrix exponential 
and the fact that $\mathcal{R}$ is a closed subspace of $\mathbb{R}^n$, we can conclude that 
$e^{-sA}b$ must lie in $\mathcal{R}$ for every $s$. Thus, the integral on the right side of (13) must lie 
in $\mathcal{R}$ for all $t$. Take $x_0 = 0$; since $\mathcal{R}$ is invariant under $A$, and hence under $e^{tA}$, 
the states that are reachable from the origin in 
finite time, by means of some input $u(t)$, must all lie within $\mathcal{R}$. Thus, if the rank 
condition does not hold, then the system is not completely controllable because 
there are states that cannot be reached from $x_0 = 0$. This establishes the implication: 
complete controllability $\Rightarrow$ $\operatorname{rank}[b \;\; Ab \;\; \cdots \;\; A^{n-1}b] = n$. 

Conversely, suppose $\operatorname{rank}[b \;\; Ab \;\; \cdots \;\; A^{n-1}b] = n$. We must now show that (5) is 
completely controllable. Choose any finite time $t_f > 0$, and consider the symmetric 
$n \times n$ matrix 

\[ M = \int_0^{t_f} e^{-sA}\, b\, b^T e^{-sA^T}\,ds. \]

We first show that $M$ is nonsingular, and then we show that nonsingularity of $M$ 
implies complete controllability. So suppose that $Mv = 0$ for some $v$; then also 
$v^TMv = 0$, and this implies that 

\[ 0 = v^TMv = \int_0^{t_f} v^T e^{-sA}\, b\, b^T e^{-sA^T} v\,ds = \int_0^{t_f} \psi(s)^2\,ds, \]

where $\psi(s) = v^Te^{-sA}b$. Since $\psi(s)^2$ is continuous and nonnegative, we conclude 
that $\psi(s) \equiv 0$ on $[0, t_f]$. It follows that 

\[ \psi(0) = v^Tb = 0, \quad \psi'(0) = -v^TAb = 0, \quad \ldots, \quad \psi^{(n-1)}(0) = \pm v^TA^{n-1}b = 0. \]

Therefore $v$ is perpendicular to $\mathcal{R}$. By the rank assumption, we must have $v = 0$, 
and therefore $M$ is nonsingular. Now take any two points $x_0$, $x_f$ in $\mathbb{R}^n$, and define 
the control $u(s) = b^Te^{-sA^T}\bar{x}$ for $0 \le s \le t_f$, where $\bar{x}$ remains to be chosen. 
The solution $x(t)$ with input $u$ and initial condition $x_0$ has final point $x_f$ at time $t_f$ 
provided that $\bar{x}$ can be chosen so that 

\[ x_f = e^{t_fA}\left( x_0 + \int_0^{t_f} e^{-sA}\, b\, b^T e^{-sA^T} \bar{x}\,ds \right) = e^{t_fA}(x_0 + M\bar{x}). \]

But $e^{t_fA}$ is nonsingular because $(e^{t_fA})^{-1} = e^{-t_fA}$, and $M$ is nonsingular, so 
$\bar{x} = M^{-1}(e^{-t_fA}x_f - x_0)$. Thus, any $x_0$ can be steered to any $x_f$ in time $t_f$, so the 
system is completely controllable. ∎ 


Our proof that the controllability rank condition is sufficient for complete 
controllability follows an argument in [9, pp. 167-168]. 
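The steering control built in the proof can be carried out in closed form for a small system. The sketch below is not taken from the article: it assumes the double integrator y'' = u, which is the companion system (2) with n = 2 and k_1 = k_2 = 0, so that A = [[0, 1], [0, 0]], b = [0, 1]^T, and e^{-sA} = I - sA exactly. The function names are hypothetical. The matrix M and the vector written as x-bar in the proof are computed exactly, and the resulting trajectory is integrated by hand.

```python
from fractions import Fraction

def steering_control(x0, xf, tf):
    """Return (alpha, beta) with u(s) = alpha*s + beta steering x0 to xf in time tf.

    System (an assumed example): x1' = x2, x2' = u. Here e^{-sA} b = [-s, 1]^T.
    """
    tf = Fraction(tf)
    # M = integral over [0, tf] of [[s^2, -s], [-s, 1]] ds
    m11, m12, m22 = tf**3 / 3, -tf**2 / 2, tf
    det = m11 * m22 - m12 * m12            # equals tf^4 / 12, never zero for tf > 0
    # r = e^{-tf A} xf - x0, with e^{-tf A} = [[1, -tf], [0, 1]]
    r1 = xf[0] - tf * xf[1] - x0[0]
    r2 = xf[1] - x0[1]
    # xbar = M^{-1} r (2 x 2 inverse written out)
    xb1 = (m22 * r1 - m12 * r2) / det
    xb2 = (-m12 * r1 + m11 * r2) / det
    # u(s) = b^T e^{-s A^T} xbar = -s*xb1 + xb2
    return -xb1, xb2

def final_state(x0, alpha, beta, tf):
    """Exact state at time tf under u(s) = alpha*s + beta."""
    tf = Fraction(tf)
    x2 = x0[1] + alpha * tf**2 / 2 + beta * tf
    x1 = x0[0] + x0[1] * tf + alpha * tf**3 / 6 + beta * tf**2 / 2
    return (x1, x2)

alpha, beta = steering_control((0, 0), (1, 0), 1)
reached = final_state((0, 0), alpha, beta, 1)
```

With x_0 = (0, 0), x_f = (1, 0), and t_f = 1 the control is u(s) = 6 - 12s, a rest-to-rest transfer. Other controls achieve the same transfer; this one is simply the one the proof produces.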

Let us illustrate both Theorem 2 and the idea of controllability by re-examining 
Example 2. 


Example 3. (Example 2 continued) The system is 

\[ x' = \begin{bmatrix} -2 & 2 \\ 1 & -1 \end{bmatrix} x + \begin{bmatrix} 1 \\ 1 \end{bmatrix} u. \]

Note that $\lambda = 0$ is an eigenvalue of $A$, and $b = [1 \;\; 1]^T$ is a corresponding 
eigenvector, so the controllability rank condition does not hold. However, $A$ is 
similar to its companion matrix. Using the $T$ computed before and $z = Tx$ we have 
the system 

\[ z' = \begin{bmatrix} 0 & 1 \\ 0 & -3 \end{bmatrix} z + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u. \]

Differentiation of the $z_1$ equation and substitution produces a second order 
equation for $z_1$: 

\[ z_1'' + 3z_1' = 3u + u', \]
which does not match (1) due to the $u'$ term. One integration produces a first 
order equation 

\[ z_1' + 3z_1 = 3\int u\,ds + u \]

(up to an additive constant fixed by the initial conditions), 
which shows that the action of arbitrary inputs $u$ affects the dynamics in only a 
one-dimensional space. The original $x$ equations might lead us to think that $u$ can 
fully affect both $x_1$ and $x_2$, but notice that the $z_2$ equation says that $u$ has no 
effect on the dynamics of the difference $x_1 - x_2 = z_2$. Only when the initial 
condition for $z$ involves $z_2(0) = 0$ can $u$ be used to control a trajectory. That is, 
the inputs completely control only the states that lie in the subspace $\operatorname{span}[b \;\; Ab] = 
\operatorname{span}\{b\} = \operatorname{span}\{[1 \;\; 1]^T\}$. Solutions starting with $x_1(0) = x_2(0)$ satisfy $x_1(t) = x_2(t) = 
\int_0^t u\,ds + x_1(0)$. One can steer along the line $x_1 = x_2$ from any initial point to any 


October 1999] FUNDAMENTAL CONTROL THEORY I 711 


final point $x_1(t_f) = x_2(t_f)$ at any finite time $t_f$ by appropriate choice of $u(t)$. On 
the other hand, if the initial condition lies off the line $x_1 = x_2$, then the difference 
$z_2 = x_1 - x_2$ decays exponentially so there is no chance of steering to an arbitrarily 
given final state in finite time. 
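The second order equation for z_1 in this example can be derived step by step. The display below is a worked sketch, assuming the transformation T = [[0, 1], [1, -1]] from Example 1, so that z_1 = x_2 and z_2 = x_1 - x_2, and b = [1 1]^T, so that Tb = [1 0]^T.

```latex
\begin{aligned}
z_1' &= z_2 + u, \qquad z_2' = -3z_2, \\
z_1'' &= z_2' + u' = -3z_2 + u' = -3(z_1' - u) + u', \\
z_1'' + 3z_1' &= 3u + u', \\
z_2(t) &= e^{-3t} z_2(0) \quad \text{for every input } u.
\end{aligned}
```

The last line is the decoupling observed in the example: the input never enters the equation for z_2 = x_1 - x_2.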


When a system is completely controllable, there are generally many input 
functions that can implement the transfer from x, to x,. This flexibility can be 
exploited in some applications to optimize the behavior of the system in some way, 
for example by minimizing a measure of the cost of carrying out the transfer. In 
particular, if the cost of the control action is measured by the integral 

\[ \int_0^{t_f} |u(s)|^2\,ds, \]


then a control that minimizes this cost can be determined. For additional details 
on this problem for linear systems, see [1, pp. 102-105]. Optimal control theory is 
concerned with optimizing various performance indices of systems such as (5). 


5. OBSERVABILITY AND DUALITY. Suppose we have a system (5) for which a 
certain linear combination of the state components $x_i$ is directly measured, 
perhaps by some combination of instruments. We write the system and its measured 
output as 
\[ x' = Ax + bu \qquad (15a) \]
\[ y = c^Tx, \qquad (15b) \]
where $c$ is a constant vector. The function $y(t)$ is our known output. 
We now ask, when is $c^T$ the first row of a transformation $T$ to the companion 
system (2) where $y$ is the dependent variable in (1)? If such a $T$ exists, then $T$ must 
have the form (12) with $\tau = c^T$. We must also have $Tb = d = [0 \;\; \cdots \;\; 0 \;\; 1]^T$. Thus, 

\[ \operatorname{rank} T = \operatorname{rank} \begin{bmatrix} c^T \\ c^TA \\ \vdots \\ c^TA^{n-1} \end{bmatrix} = n. \qquad (16) \]
In addition, since $Tb = d$, we have 
\[ \begin{bmatrix} c^T \\ c^TA \\ \vdots \\ c^TA^{n-1} \end{bmatrix} b = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} = \begin{bmatrix} b^Tc \\ b^TA^Tc \\ \vdots \\ b^T(A^T)^{n-1}c \end{bmatrix}. \qquad (17) \]

You can check that the differential equation for $z = [y \;\; y' \;\; \cdots \;\; y^{(n-1)}]^T$ really is the 
companion form (2), by remembering that $A$ satisfies its own characteristic 
polynomial: $A^n + k_1A^{n-1} + \cdots + k_{n-1}A + k_nI = 0$. 


Proposition 3. There exists a nonsingular $T$ transforming (15a) to companion form 
(2) with $z_1 = y = c^Tx$, if and only if the rank condition (16) holds and (17) is satisfied. 
In this case, $T$ is uniquely determined and is the matrix in (16). 


Note that the matrix in (16) has the same rank as the matrix $[c \;\; A^Tc \;\; \cdots \;\; (A^T)^{n-1}c]$, 
so that $y = c^Tx$ satisfies (1) if and only if the system 

\[ x' = A^Tx + cu \qquad (18) \]
is completely controllable, by Theorem 2. Moreover, (17) shows that $b^T$ is the first 
row of the transformation that takes (18) to companion form. 

The connection of Proposition 3 with the system (18) leads to a fundamental 
duality between complete controllability and the concept of complete observability. 
The rank condition (16) is known as the observability rank condition and the matrix 

\[ \begin{bmatrix} c^T \\ c^TA \\ \vdots \\ c^TA^{n-1} \end{bmatrix} \qquad (19) \]

is called the observability matrix for the system (15). The rank condition implies 
that the system state $x$ can be reconstructed from knowledge of $y$, $u$, and their 
derivatives. Here is a basic definition. 


Definition 4. The system (15) is completely observable if, for any $x_0 = x(0)$, there is 
a finite time $t_f > 0$ such that knowledge of the input $u(t)$ and output $y(t)$ on $[0, t_f]$ 
suffices to determine $x_0$ uniquely. 

Definition 4 could be restated using only the zero input, $u \equiv 0$. To see how a 
determination of $x_0$ is made when the observability rank condition holds, differentiate 
the output equation (15b) $n - 1$ times and set $t = 0$ to get 

\[ \begin{bmatrix} y(0) \\ y'(0) \\ \vdots \\ y^{(n-1)}(0) \end{bmatrix} = \begin{bmatrix} c^T \\ c^TA \\ \vdots \\ c^TA^{n-1} \end{bmatrix} x_0 + \text{terms dependent on } u. \qquad (20) \]

Under the observability rank condition, the coefficient of $x_0$ is nonsingular, and 
we can solve for $x_0$ in terms of $y$ and $u$ and their derivatives. To illustrate, 
consider the system of Example 2 with output $y = [1 \;\; 0]x$. In this case, (20) gives 

\[ \begin{bmatrix} y(0) \\ y'(0) \end{bmatrix} = \begin{bmatrix} c^T \\ c^TA \end{bmatrix} x_0 + \begin{bmatrix} 0 \\ c^Tb \end{bmatrix} u(0) = \begin{bmatrix} 1 & 0 \\ -2 & 2 \end{bmatrix} x_0 + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(0). \]

The coefficient matrix of $x_0$ is nonsingular; therefore the system is completely observable. 
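Solving (20) for the initial state is a two-line computation in this example. The sketch below assumes the data of Example 2 with output y = x_1, namely A = [[-2, 2], [1, -1]], b = [1, 1]^T, c^T = [1, 0], so that y(0) = x_1(0) and y'(0) = -2x_1(0) + 2x_2(0) + u(0). The function names are illustrative.

```python
from fractions import Fraction

def output_data(x0, u0):
    """Return (y(0), y'(0)) produced by initial state x0 and input value u(0) = u0."""
    y0 = x0[0]
    dy0 = -2 * x0[0] + 2 * x0[1] + u0
    return y0, dy0

def initial_state(y0, dy0, u0):
    """Invert the two relations above to recover x(0) from y(0), y'(0), u(0)."""
    x1 = Fraction(y0)
    x2 = (Fraction(dy0) - Fraction(u0) + 2 * x1) / 2
    return [x1, x2]

y0, dy0 = output_data([3, -1], 4)
recovered = initial_state(y0, dy0, 4)
```

The division by 2 is the determinant of the observability matrix [[1, 0], [-2, 2]]; if that determinant were zero, the second relation could not be solved for x_2(0).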


Theorem 3. The system (15) is completely observable if and only if the observability 
rank condition (16) holds. 


Proof: We have already shown the sufficiency of the rank condition (16). Now 
assume that complete observability holds; we must show that (16) holds. For the 
purpose of contradiction, suppose also that the observability matrix has deficient 
rank; then, there is a nonzero vector $v$ such that 

\[ \begin{bmatrix} c^T \\ c^TA \\ \vdots \\ c^TA^{n-1} \end{bmatrix} v = 0. \qquad (21) \]

Now take $x_0 = v$ and consider the output $y(t) = c^Te^{tA}v$ using input $u \equiv 0$. Using 
(21), the Cayley-Hamilton theorem, and the definition of the matrix exponential, the 
series expansion for $y$ must 
have all coefficients equal to zero. Thus, $y \equiv 0$; but this is also the output when 
$x_0 = 0$ under zero input. This contradicts the complete observability assumption. ∎ 


Motivated by the comments following Proposition 3, we define the dual system 
of (15) to be 

\[ x' = A^Tx + c\,u(t) \qquad (22a) \]
\[ y = b^Tx. \qquad (22b) \]


Then the dual of the dual of a system is the original system. With this definition 
we can encapsulate the discussion thus far with the following classical 
duality statement. 


Theorem 4. The system (15) is completely observable if and only if the system (22a) is 
completely controllable. The system (15a) is completely controllable if and only if the 
dual system (22) is completely observable. 
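The algebra behind the duality statement is that the observability matrix of (A, c) is the transpose of the controllability matrix of (A^T, c), so the two rank conditions coincide. A minimal sketch, using the Example 2 matrix A = [[-2, 2], [1, -1]] with c = [1, 0]^T as assumed data and hypothetical helper names:

```python
def transpose(M):
    """Transpose of a matrix stored as a list of rows."""
    return [list(row) for row in zip(*M)]

def controllability_matrix(A, b):
    """Matrix with columns b, Ab, ..., A^{n-1} b, returned as a list of rows."""
    cols, v = [], list(b)
    for _ in range(len(A)):
        cols.append(v)
        v = [sum(a * x for a, x in zip(row, v)) for row in A]
    return transpose(cols)

def observability_matrix(A, c):
    """Matrix with rows c^T, c^T A, ..., c^T A^{n-1}."""
    n = len(A)
    rows, r = [], list(c)
    for _ in range(n):
        rows.append(r)
        r = [sum(r[i] * A[i][j] for i in range(n)) for j in range(n)]
    return rows

A = [[-2, 2], [1, -1]]
c = [1, 0]
O = observability_matrix(A, c)
C_dual = controllability_matrix(transpose(A), c)
```

Since a matrix and its transpose have the same rank, the observability rank condition for (15) and the controllability rank condition for (22a) hold or fail together.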


6. FEEDBACK, STABILIZATION, OBSERVERS, AND DUALITY. A major theme 
in control theory is the use of feedback to modify the system dynamics to achieve 
some desired behavior, for example to stabilize an otherwise unstable equilibrium 
point. In this section we indicate some advantages of an equivalence with the 
companion system (2) with regard to these issues. We also present one additional 
consequence of duality. The considerations in this section help to indicate that 
much can be accomplished with the control of linear systems, and thus it is 
desirable to have an extension of the solution of the equivalence problem involving 
systems (2) and (5) to the case where (5) is replaced by a single-input nonlinear 
system. 


Definition 5. The linear system x’ = Ax is stable if all eigenvalues of A lie in the 
open left half-plane. 


From the theory of linear differential equations, it is known that all solutions 
satisfy $x(t) \to 0$ as $t \to \infty$ if all the eigenvalues of $A$ have negative real part. In this case, 
the equilibrium at the origin is asymptotically stable. 


Definition 6. In system (5), linear state feedback is specified by $u = Kx$ where $K$ is 
a real $1 \times n$ matrix. The corresponding closed loop system is $x' = (A + bK)x$. 


Consider the companion form (2). Using linear state feedback, $u = Kz$, it is 
possible to assign eigenvalues arbitrarily to the resulting closed loop system, 
provided that the complex eigenvalues of $P + dK$ occur in complex conjugate 
pairs. Specifically, by setting $u = Kz = [-\alpha_n \;\; -\alpha_{n-1} \;\; \cdots \;\; -\alpha_1]z$ in (2) we get the 
closed loop system $z' = \tilde{P}z$, where $\tilde{P} = P + dK$ has the same form as $P$ in (3) except the 
last row is now $[-(k_n + \alpha_n) \;\; -(k_{n-1} + \alpha_{n-1}) \;\; \cdots \;\; -(k_1 + \alpha_1)]$. Suppose that 
$m_1, m_2, \ldots, m_n$ are the desired coefficients of the characteristic polynomial of the 
closed loop, $z' = \tilde{P}z$. With the $k_i$ known and the $m_i$ specified, we take $\alpha_i = m_i - k_i$. 
Thus, the coefficients of the characteristic polynomial of $P + dK$ may be chosen 
so that all its roots lie in the open left half plane. And the exponential rate of 
convergence of $z(t)$ to the origin can be increased, for example, by shifting the 
roots to the left in the complex plane. 

A system that is not stable might be made stable if modified by appropriate 
linear feedback. 


Definition 7. The linear system $x' = Ax + bu(t)$ is stabilizable if there exists a 
$1 \times n$ matrix $K$ such that the linear system $x' = (A + bK)x$ is stable. 


Theorem 5. If $x' = Ax + bu(t)$ is completely controllable then it is stabilizable and 
the eigenvalues of $x' = (A + bK)x$ can be assigned arbitrarily (provided that complex 
eigenvalues occur in conjugate pairs) by appropriate choice of $K$. 


Proof: We have discussed the proof only for the special case of a companion 
system. By complete controllability, there is a nonsingular $T$ with $z = Tx$ such 
that $z' = TAT^{-1}z + Tbu$ is a companion system. Therefore the eigenvalues of 
$TAT^{-1} + Tb\bar{K}$ can be assigned as described, where $u = \bar{K}z$ represents linear 
feedback for the companion system. Now the similarity 

\[ TAT^{-1} + Tb\bar{K} = T(A + b\bar{K}T)T^{-1} = T(A + bK)T^{-1}, \qquad K = \bar{K}T, \]

shows that the eigenvalues of $A + bK$ can be assigned by appropriate choice of 
feedback $u = Kx$. ∎ 
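The proof gives a recipe for the gain: form the companion-coordinates gain [-α_n ... -α_1] with α_i = m_i - k_i, then multiply on the right by T. The sketch below applies it to the controllable system of Example 1, assuming A = [[-2, 2], [1, -1]], b = [1, 0]^T, T = [[0, 1], [1, -1]], and open-loop coefficients k_1 = 3, k_2 = 0. The target polynomial λ² + 3λ + 2 (roots -1 and -2) is an arbitrary illustrative choice, and `feedback_gain` is a hypothetical name.

```python
def feedback_gain(k, m, T):
    """Return K = Kbar T, where Kbar = [-(m_n - k_n), ..., -(m_1 - k_1)].

    k = [k_1, ..., k_n]: open-loop characteristic polynomial coefficients.
    m = [m_1, ..., m_n]: desired closed-loop coefficients.
    T: the transformation to companion form (list of rows).
    """
    alpha = [mi - ki for mi, ki in zip(m, k)]        # alpha_i = m_i - k_i
    k_bar = [-a for a in reversed(alpha)]            # [-alpha_n, ..., -alpha_1]
    n = len(T)
    return [sum(k_bar[i] * T[i][j] for i in range(n)) for j in range(n)]

A = [[-2, 2], [1, -1]]
b = [1, 0]
K = feedback_gain([3, 0], [3, 2], [[0, 1], [1, -1]])

# Closed-loop matrix A + bK and its characteristic polynomial
# lambda^2 - trace*lambda + det (valid for 2 x 2 matrices).
closed = [[A[i][j] + b[i] * K[j] for j in range(2)] for i in range(2)]
trace = closed[0][0] + closed[1][1]
det = closed[0][0] * closed[1][1] - closed[0][1] * closed[1][0]
```

Here K = [0, -2], and the closed-loop matrix has characteristic polynomial λ² + 3λ + 2, as requested.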


There is a concept dual to stabilizability that involves the state-to-output 
interaction of system (15). We give only a very brief discussion. 


Definition 8. System (15) is detectable if there exists an $n \times 1$ matrix $L$ such that 
the system $x' = (A + Lc^T)x$ is stable. 


Forming the matrix $A + Lc^T$ corresponds to output feedback given by $u = Ly = 
Lc^Tx$. The eigenvalues for $A + Lc^T$ are the same as those for $A^T + cL^T$, which 
corresponds to state feedback $u = L^Tx$ in the dual system (22a). Thus a system is 
detectable if and only if the dual system is stabilizable. These are purely algebraic 
statements. An analytic interpretation of detectability derives from its implication 
that linear output feedback can be used to “detect” system trajectories asymptotically 
through a construction known as an observer system. Specifically, consider 
the system 

\[ \xi' = A\xi + bu - L(y - c^T\xi), \qquad (23) \]
where $\xi$ is an auxiliary state that can be initialized at any vector $\xi(0)$. The auxiliary 
state $\xi$ is intended to approximate the true state $x$, and $L$, a so-called “output 
error” gain matrix, is to be chosen so that $\xi$ approximates $x$. Define the error by 

\[ e = x - \xi. \]
The objective is to choose $L$ so that $e \to 0$ as $t \to \infty$. Now, subtraction of (23) from 
(15a) gives 
\[ e' = (A + Lc^T)e. \]


Theorem 6. If the system (15) is detectable then $L$ can be chosen in system (23) so 
that $e = x - \xi \to 0$ as $t \to \infty$, independently of the initial condition $\xi(0)$. 
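The decay of the observer error can be watched in a short simulation. This is a rough numerical sketch, not part of the article: it assumes the Example 2 system with output y = x_1 (A = [[-2, 2], [1, -1]], b = [1, 1]^T, c^T = [1, 0]), the gain L = [0, -1]^T for which A + Lc^T has eigenvalues -2 and -1, an arbitrary test input u(t) = sin t, and a hand-written fourth-order Runge-Kutta step with step size 0.01.

```python
import math

A = [[-2.0, 2.0], [1.0, -1.0]]
b = [1.0, 1.0]
c = [1.0, 0.0]
L = [0.0, -1.0]   # assumed gain; A + L c^T then has eigenvalues -2 and -1

def rhs(t, s):
    """Right-hand side for the stacked state s = (x1, x2, xi1, xi2)."""
    x, xi = s[:2], s[2:]
    u = math.sin(t)                                  # arbitrary test input
    y = c[0] * x[0] + c[1] * x[1]                    # measured output
    err = y - (c[0] * xi[0] + c[1] * xi[1])          # output error y - c^T xi
    dx = [A[i][0] * x[0] + A[i][1] * x[1] + b[i] * u for i in range(2)]
    dxi = [A[i][0] * xi[0] + A[i][1] * xi[1] + b[i] * u - L[i] * err
           for i in range(2)]
    return dx + dxi

def rk4_step(t, s, h):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(t, s)
    k2 = rhs(t + h / 2, [a + h / 2 * k for a, k in zip(s, k1)])
    k3 = rhs(t + h / 2, [a + h / 2 * k for a, k in zip(s, k2)])
    k4 = rhs(t + h, [a + h * k for a, k in zip(s, k3)])
    return [a + h / 6 * (p + 2 * q + 2 * r + w)
            for a, p, q, r, w in zip(s, k1, k2, k3, k4)]

def estimation_error(s):
    """Euclidean norm of e = x - xi."""
    return math.hypot(s[0] - s[2], s[1] - s[3])

def simulate(t_end=10.0, h=0.01):
    s, t = [1.0, -1.0, 0.0, 0.0], 0.0    # x(0) = (1, -1), xi(0) = (0, 0)
    for _ in range(round(t_end / h)):
        s = rk4_step(t, s, h)
        t += h
    return s

initial_error = estimation_error([1.0, -1.0, 0.0, 0.0])
final_error = estimation_error(simulate())
```

The error obeys e' = (A + Lc^T)e regardless of the input, so the choice of u affects the state x but not the rate at which the estimate converges.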


Some comments on this construction are in order. The observer system (23) is 
an alternative to computing the solutions of the system (15) with a direct numerical 
method. By using the known data provided by $A$, $b$, and $c$, together with $y$ and $u$, 
system (23) can be simulated numerically with a guarantee that the estimated state 
asymptotically reconstructs the true system state $x$ for (15), independently of the 
initial $\xi(0)$. If we were so lucky as to have $\xi(0) = x(0)$, then the observer equation 
(23) implies that $\xi(t) = x(t)$ for all $t$, a perfect estimate. You can think of (23) as a 
system with inputs $u$ and $y$, and with output $\xi$, the desired approximation. The 
estimate $\xi$ itself can be fed back to (15a), via $u = K\xi$, in place of the true state for 
purposes of stabilization of (15a), provided that (15a) is stabilizable. In other 
words, the eigenvalues of the closed loop system can be placed somewhere within 
the left half-plane even though only the output $y$ is measured. Moreover, an 
important feature of this construction is that the controller (that is, the matrix $K$) 
and the observer (essentially the matrix $L$) can be designed independently while 
ensuring that the overall, interconnected observer/controller system is stable. To 
see this, use (15a) together with (23) to write the combined system for $(x, \xi)$ as 

\[ \begin{bmatrix} x \\ \xi \end{bmatrix}' = \begin{bmatrix} A & bK \\ -Lc^T & A + Lc^T + bK \end{bmatrix} \begin{bmatrix} x \\ \xi \end{bmatrix}. \qquad (24) \]
We can obtain the characteristic polynomial for this system by using the following 
similarity transformation: 
\[ \begin{bmatrix} I & 0 \\ -I & I \end{bmatrix} \begin{bmatrix} A & bK \\ -Lc^T & A + Lc^T + bK \end{bmatrix} \begin{bmatrix} I & 0 \\ I & I \end{bmatrix} = \begin{bmatrix} A + bK & bK \\ 0 & A + Lc^T \end{bmatrix}. \]
Thus, the characteristic polynomial of the coefficient matrix in (24) is the product 
of the characteristic polynomials of $A + bK$ and $A + Lc^T$. This means that $K$ can 
be designed without regard to the fact that only state estimates will be fed back, 
and the observer error gain $L$ can be designed without reference to the fact that 
the resulting state estimates are fed back for stabilization purposes. This independent 
design feature is often called the Separation Principle. 
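The factorization of the characteristic polynomial can be confirmed exactly on a small case. The sketch below assumes the Example 2 data with output y = x_1 (A = [[-2, 2], [1, -1]], b = [1, 1]^T, c^T = [1, 0]) together with the gains K = [-2, 0] and L = [0, -1]^T. Characteristic polynomials are computed with the Faddeev-LeVerrier recursion, a standard method chosen here for convenience rather than taken from the article; all function names are illustrative.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def char_poly(A):
    """Coefficients [1, c_1, ..., c_n] of det(lambda*I - A) (Faddeev-LeVerrier)."""
    n = len(A)
    A = [[Fraction(v) for v in row] for row in A]
    identity = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    coeffs, M = [Fraction(1)], identity
    for k in range(1, n + 1):
        AM = matmul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k
        coeffs.append(c)
        M = [[AM[i][j] + c * identity[i][j] for j in range(n)] for i in range(n)]
    return coeffs

def poly_mul(p, q):
    """Product of two polynomials given by coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

A = [[-2, 2], [1, -1]]
b, c = [1, 1], [1, 0]
K, L = [-2, 0], [0, -1]

bK = [[b[i] * K[j] for j in range(2)] for i in range(2)]
LcT = [[L[i] * c[j] for j in range(2)] for i in range(2)]
A_bK = [[A[i][j] + bK[i][j] for j in range(2)] for i in range(2)]
A_LcT = [[A[i][j] + LcT[i][j] for j in range(2)] for i in range(2)]

# Coefficient matrix of the combined system: [[A, bK], [-L c^T, A + L c^T + bK]].
combined = [A[i] + bK[i] for i in range(2)] + [
    [-LcT[i][j] for j in range(2)] + [A_LcT[i][j] + bK[i][j] for j in range(2)]
    for i in range(2)
]

combined_poly = char_poly(combined)
product_poly = poly_mul(char_poly(A_bK), char_poly(A_LcT))
```

Both sides come out as λ⁴ + 8λ³ + 23λ² + 28λ + 12 = (λ² + 5λ + 6)(λ² + 3λ + 2), so the closed-loop eigenvalues are those of A + bK together with those of A + Lc^T.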
Let us consider two examples illustrating stabilizability and detectability. 


Example 4. We return to Example 2 once more, and adjoin the output equation 
$y = x_1$. Then the system coefficients are 

\[ A = \begin{bmatrix} -2 & 2 \\ 1 & -1 \end{bmatrix}, \qquad b = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad c^T = [1 \;\; 0]. \]

This system is both stabilizable and detectable, using the feedback matrix $K$ and 
observer matrix $L$ given by 

\[ K = [-2 \;\; 0], \qquad L = \begin{bmatrix} 0 \\ -1 \end{bmatrix}, \]

because $A + bK$ then has eigenvalues $-2, -3$, and $A + Lc^T$ has eigenvalues 
$-2, -1$. Other choices for $K$ and $L$ are also possible. 


Example 5. Stabilizability and detectability are not guaranteed. Consider the 
linear system with coefficients 


4={j “i b= ||, cT=[0 1]. 


In this case, any $1 \times 2$ feedback matrix $K$ produces a closed loop matrix $A + bK$ 
with zero as an eigenvalue; therefore, the system is not stabilizable. Also, any 
$2 \times 1$ matrix $L$ yields a matrix $A + Lc^T$ with zero as an eigenvalue; therefore the 
system is not detectable. 
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7. A BRIEF NOTE ON EXTENSIONS. Let us briefly describe an extension of our 
discussion of the single-input systems in Sections 4—6 to the case of linear systems 
with multivariable input and output, | 

x’ = Ax + Bu(t) (25a) 

y=Cx, - (25b) 
where u € R”, y © R”, and thus B is n X m and C is p Xn. As noted before, 
the rank tests for controllability and observability allow for a statement of alge- 
braic duality between these concepts, once an appropriate dual system is identi- 
fied. The same principles extend to (25). 

Definition 3 (complete controllability) makes sense for the m-input case; the
admissible control functions are R^m-valued functions u(t) such that every entry of
u(t) is locally integrable.

The characterization of complete controllability in Theorem 2 directly carries 
over to (25a) with no change in the statement. In this case, the controllability 
matrix [B AB ... A^{n-1}B] has size n × nm, and the proof proceeds as before from
(13). With careful attention to the dimensions involved, the same proof carries
through; the M matrix is still n × n while w is 1 × m.

Complete observability of the system (25) is defined exactly as in Definition 4, 
and the system (25) is completely observable if and only if the observability matrix, 


\[
\begin{bmatrix} C \\ CA \\ \vdots \\ CA^{n-1} \end{bmatrix},
\]

which is now pn × n, has rank n. One checks that the proof of Theorem 3 carries
through as before. 

The dual system of (25) is defined by 

x’ = A^T x + C^T u(t) (26a)

y = B^T x, (26b)

with matrix dimensions determined, of course, by (25). Theorem 4, which documents
the algebraic duality of complete observability and complete controllability,
is also valid when applied to (25) and its dual system.
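The rank tests and their duality can be tried out directly. The following Python sketch builds the n × nm controllability matrix and the pn × n observability matrix with exact rational arithmetic and confirms that the observability matrix of (A, C) is the transpose of the controllability matrix of (A^T, C^T). The function names and the sample matrices are assumptions made for illustration, not part of the article.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [[Fraction(v) for v in row] for row in M]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == rows:
            break
    return r

def controllability_matrix(A, B):
    """[B AB ... A^{n-1}B], of size n x nm."""
    n, blocks, block = len(A), [], B
    for _ in range(n):
        blocks.append(block)
        block = matmul(A, block)
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]

def observability_matrix(A, C):
    """Rows of C, CA, ..., CA^{n-1} stacked, of size pn x n."""
    n, rows, block = len(A), [], C
    for _ in range(n):
        rows.extend(block)
        block = matmul(block, A)
    return rows

# Sample data (assumed): n = 3 states, m = 2 inputs, p = 1 output.
A = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
B = [[0, 0], [1, 0], [0, 1]]
C = [[1, 0, 0]]

print(rank(controllability_matrix(A, B)))   # full rank n means completely controllable
print(rank(observability_matrix(A, C)))     # full rank n means completely observable
print(observability_matrix(A, C)
      == transpose(controllability_matrix(transpose(A), transpose(C))))
```

Exact fractions are used instead of floating point so that the rank decision does not depend on a numerical tolerance.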

The extension of Theorem 5 to the case of m-input controllable systems can be 
based on the single-input result: see [13, pp. 49-51] for an accessible proof that
essentially reduces the m-input case to the single-input case. 

Definition 7 and Definition 8 have straightforward extensions to the m-input 
and p-output cases. Theorem 6 on observer construction is easily seen to extend to 
(25); the extension is essentially notational. 

Once we move to linear systems with time-dependent coefficient matrices, 
additional technical issues arise in any extension of observability, controllability, 
and their duality, although several extensions have been accomplished. For exam- 
ple, several useful definitions of controllability for time-varying systems are possi- 
ble, all of which coalesce in the linear constant coefficient case to describe the 
same concept. These definitions may involve the initial time t_0, the particular
initial state x_0 considered, and the time interval over which control action is
to take place. The reader interested in these issues is invited to explore the 
references. However, let us give one further example to illustrate that time- 
varying systems require alternative approaches in order to describe controllability 
properties. 

Example 6. [11] Consider the system

\[
x' = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} x
   + \begin{bmatrix} b_1(t) \\ b_2(t) \end{bmatrix} u.
\]

The general solution is

\[
x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}
     = \begin{bmatrix}
         e^{t}\left( x_1(0) + \int_0^t e^{-s} b_1(s) u(s)\,ds \right) \\
         e^{2t}\left( x_2(0) + \int_0^t e^{-2s} b_2(s) u(s)\,ds \right)
       \end{bmatrix}.
\]

If b_1 and b_2 are nonzero constants, then Theorem 2 ensures that the system is completely
controllable. Suppose now that b_1(t) = e^t and b_2(t) = e^{2t}; the solution is then

\[
x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix}
     = \begin{bmatrix}
         e^{t}\left( x_1(0) + \int_0^t u(s)\,ds \right) \\
         e^{2t}\left( x_2(0) + \int_0^t u(s)\,ds \right)
       \end{bmatrix}.
\]

Thus, solutions that start on the line x_1 = x_2 at t = 0 always satisfy the condition
x_2(t) = e^t x_1(t), and from this condition we can conclude that the system is not
completely controllable according to Definition 3, for if x_1(0) = x_2(0), then the
motion is confined to the first or third quadrant since x_1 and x_2 must have the
same sign. In particular, the set of points reachable from the origin lies within
those two quadrants. If we consider the controllability rank condition in a pointwise
manner, that is, if we consider the following matrix for each time instant t,

\[
[\,B(t) \;\; AB(t)\,] = \begin{bmatrix} e^{t} & e^{t} \\ e^{2t} & 2e^{2t} \end{bmatrix},
\]

we obtain a nonsingular matrix. This example shows that a pointwise interpretation
of the controllability rank condition of Theorem 2 does not lead to a satisfactory 
criterion for complete controllability of a time-varying linear system. 
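Both halves of Example 6 can be checked numerically. The Python sketch below evaluates the closed-form solution for b_1(t) = e^t and b_2(t) = e^{2t}, taking the value of the integral of u over [0, t] as an input so that no particular control has to be chosen, and it evaluates the determinant of the pointwise matrix [B(t) AB(t)]. The function names are illustrative assumptions.

```python
import math

def state(t, x0, integral_u):
    """Closed-form solution of Example 6 with b1(t) = e^t, b2(t) = e^{2t}.

    integral_u stands for the integral of u(s) over [0, t]; any locally
    integrable control enters the solution only through this number.
    """
    x1 = math.exp(t) * (x0[0] + integral_u)
    x2 = math.exp(2 * t) * (x0[1] + integral_u)
    return x1, x2

def pointwise_det(t):
    """det [B(t) AB(t)] for B(t) = (e^t, e^{2t})^T and A = diag(1, 2)."""
    b1, b2 = math.exp(t), math.exp(2 * t)
    return b1 * (2 * b2) - b1 * b2

# Starting on the line x1 = x2, every control keeps x2(t) = e^t * x1(t),
# so x1 and x2 never take opposite signs.
for integral_u in (-3.0, 0.0, 2.5):
    x1, x2 = state(1.0, (1.0, 1.0), integral_u)
    print(integral_u, x1, x2, math.isclose(x2, math.e * x1))

# The pointwise rank test nevertheless succeeds at every t: det = e^{3t} > 0.
print(pointwise_det(1.0), math.exp(3.0))
```

The determinant equals e^{3t}, which never vanishes, while the reachable set from the line x_1 = x_2 stays inside two quadrants; the two printed checks display exactly this mismatch.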


8. FURTHER READING. Three major themes in control theory (and in this 
article) involve (i) the input-to-state interaction: controllability, (ii) the state-to-out- 
put interaction: observability, and (iii) transitions between different representations 
of a dynamical system. We have tried to illustrate those themes in a discussion of 
an equivalence problem for single-input linear systems. 

Two comprehensive texts that focus on time-invariant linear systems are [7] and 
[9]. For more on multivariable input and output, and time-varying linear systems, 
see [1], [2], [3], [11], [12], and [13]. The linear algebra text [4] provides a mathemati- 
cian’s view of some fundamental results of control-theoretic interest. The presenta- 
tion of linear control theory in [13] is nicely unified around the concept of invariant 
subspace. Additional perspective on linear systems theory from the mathematical 
point of view can be obtained from [5].


The Education of a Pure Mathematician


Bruce Pourciau 


Characters: Stu and Denton, mathematics majors; Integrity Jane, philosophy major 
and auditor from Hell; Professor Class, professor of mathematics. 


Setting: a university classroom, during the first days of a course called Foundations 
of Analysis, taught by Professor Class. 


WEDNESDAY, THE FIRST DAY 


PROFESSOR CLASS Good morning. I hope everyone enjoyed a rewarding and 
relaxing summer. I’m pleased to see so many familiar names on my class list— 
except for Ms Integrity Jane... . Is she here? I’m sorry, how do you pronounce 
your last name? 


INTEGRITY Just call me Integrity. You probably don’t recognize me, because I’m a 
philosophy major. I’m just auditing. 


PROFESSOR CLASS That’s an unusual name. 
INTEGRITY Tell me about it. 


PROFESSOR CLASS Well, welcome Integrity and welcome everyone to the Founda- 
tions of Analysis, also known fondly around here as “The Education of a Pure 
Mathematician”. We'll be covering logic, set theory, the real numbers, and the rest 
of the topics listed on the syllabus I’m handing out. As we work through these 
topics, we will come to appreciate the roles of definitions, axioms, and logical 
deduction, and learn how to read, understand, and write formal proofs. In a way, 
this course is a kind of ceremonial rite of passage, for in passing through it, we 
absorb how to think and act like pure mathematicians. Everyone has a copy of the 
textbook? Good. Then let’s begin. Yes, Integrity? 


INTEGRITY Before we get started, I’d like to ask a favor of you and the rest of the 
class. After doing some work in the philosophy of science last year, I signed up to 
audit this course because I thought the foundations of analysis would offer a 
paradigm for how scientists should build a field of rational and unbiased inquiry. 


PROFESSOR CLASS I think you’ve come to the right place. If you can’t find rational 
and unbiased inquiry in mathematics, where can you find it? 


INTEGRITY Exactly. So I was wondering, what if we, all of us together, agreed on a 
short list of basic principles for the construction of a field of scientific inquiry. 
Then, as the course goes along, we can keep track of how consistent we’re being 
with our basic principles. Would this be OK with everybody?


STU Sure, fine with me, why not.
DENTON Sounds like fun. 


PROFESSOR CLASS I think it’s a splendid idea. Does anyone object? No one? Looks 
like you have a deal, Integrity. 


INTEGRITY After giving this some thought last night, I even have some possible 
principles to suggest. 


PROFESSOR CLASS Excellent. Why don’t you write them on the board, over to the 
side there, and we’ll discuss them. 


INTEGRITY All right. Here they are: 


Some Possible Principles 
for the Construction of a Field of Scientific Inquiry 


M Know what something means before you ask if it’s true. 
A Build in no clearly unwarranted assumptions. 


S Move from the simple to the less simple. 


PROFESSOR CLASS Only three? 


INTEGRITY I thought about others, as well as some variations, but these three 
struck me as more basic, less open to reasonable objections. For example, the 
variation of Principle A, “Make no clearly unwarranted assumptions”, doesn’t seem
to work, since we often test a claim (that the earth is flat, that x² = 2 for some
rational number, or whatever) by assuming its truth temporarily in order to study
its consequences. But this is a far cry from assuming its truth permanently, which 
would build in an unwarranted assumption, turning the assumption into a given 
that could influence, even determine, the shape of further inquiry. Also, these 
principles obviously aren’t supposed to be sufficient or anything. I’m only propos- 
ing them tentatively as rules that should be followed as we put together any 
rational and unbiased field of scientific inquiry. At the very least, you would think 
that a scientist trained in a field of inquiry that violates some of these principles 
ought to be aware of this fact and be able to defend the violations. 


DENTON Aren’t they just common sense, though? 


STU Yeah, I was hoping we’d get some interesting arguments out of this, but these 
principles seem spineless. Who would violate them? 


PROFESSOR CLASS In any case, I’ll box them in and write “save” over here so they
don’t get erased. And speaking of obviously correct principles, let’s begin our 
course with a few lectures devoted to formal logic. 


INTEGRITY Before we do any mathematics? 


PROFESSOR CLASS Sure. It seems only reasonable to review the general rules of 
correct thought before we apply them to the particular area of mathematics. Now 
then, for us a statement will be a sentence that can be labeled true or false. In
formal logic we study the truth values of complex statements that we learn how to 
make in precise ways from simpler statements. For example, we... Yes, Integrity? 


INTEGRITY I’m sorry to interrupt, but I’m worried that we might already be
violating Principle A if we continue. 


PROFESSOR CLASS How’s that? 


INTEGRITY Well, how can we be sure that logic applies to mathematics before we 
do any mathematics? Wouldn’t that be an unwarranted assumption? I realize it 
may seem odd to suggest that formal logic might not preserve truth when applied 
to mathematical assertions, but still... . 


DENTON Be serious. Logic isn’t up for debate. It just is. 


INTEGRITY I am serious. Logic deals with statements, that is, sentences that must 
be true or false, independently of whether we can know them to be true or false. 
But until we understand the meaning of mathematical assertions, their particular
character and what they’re about, how can we know whether it’s appropriate to 
assume that they are always either true or false? Putting it this way, it looks as if 
we're going against Principle M too. 


PROFESSOR CLASS Formal logic goes all the way back to Aristotle. For over two 
thousand years, we have never found logic to conflict with our experience in the 
world around us. Of course this is hardly surprising, since formal logic merely sets 
out and studies the self-evident laws of correct reasoning. It deals with formal 
manipulations that preserve truth, no matter what the meaning of the statements. 
So it’s prior to every science, including mathematics. 


INTEGRITY Is it prior to quantum mechanics, for example? I remember from my 
philosophy of science class that some sort of “quantum logic” may fit the quantum 
world better than classical logic.¹ Anyway, the point is, what if the meaning of a
mathematical assertion precludes its being regarded as always true or false inde- 
pendently of our knowing which? Then formal logic, and perhaps some of the 
procedures it sanctifies (such as the Law of the Excluded Middle) would not 
necessarily apply. After all, the world around us is finite, while mathematics is 
filled with infinite processes and structures. Isn’t it unjustified at this point to 
assume that formal logic, which seems to work beautifully in this finite world, must 
necessarily also work in the infinite world of mathematics? 


PROFESSOR CLASS We know it works in mathematics. It’s worked perfectly for 
centuries. 


INTEGRITY But perhaps only because classical logic was presupposed in that 
mathematics, just as you were about to presuppose it here. How could logic not 
work in a mathematics where logic—and in particular the assumption that asser- 
tions must be true or false—was built into it from the start? How can we ask 
whether mathematical assertions are always true or false, until we know the 
meaning of mathematical assertions? When we use a logic that takes this 
“bivalence” as given, before we know what mathematical assertions are about, we 
are in clear violation of both Principle A and Principle M.


DENTON It seems to me that bivalence is just the formal reflection of something
we all believe: that mathematical assertions somehow embody “eternal truths”. 


INTEGRITY I believe this too, but we should not allow this sort of “religious faith” 
to commit us to certain types of reasoning in mathematics, ahead of understanding 
the meaning of mathematical assertions.³


DENTON Hogwash. Nothing could be more clear than that bivalence applies to 
mathematical assertions. Take the Riemann Hypothesis. Either all the nontrivial 
complex zeros of the zeta function lie on the line σ = 1/2 or there are some that
don’t. The Riemann Hypothesis is either true or false, whether we can prove it or 
not. 


INTEGRITY To repeat my mantra: you cannot know this for sure until you first 
decide on the meaning you wish to assign to mathematical assertions. That’s 
Principle M. I think you are being deceived by metaphors taken literally, by talk 
“about complex zeros of the zeta function” that you interpret as being literally 
about mathematical objects that exist independently of us.⁴ Your “certainty” that
the Riemann Hypothesis must be true or false, independently of human knowl- 
edge, therefore rests on uncertain metaphysical speculation. 


STU This feels backwards. If we throw logic out, how will we know if our thinking 
is correct? And how can we really throw it out anyway; it’s built into our language. 


INTEGRITY You're not saying that purely linguistic structures should determine the
validity of mathematical structures, are you?⁵ Any apparently real content in such a
mathematics could turn out to be an illusion created by language. And if you 
accept classical logic as given, so that the idea of calling the validity of that logic 
into question becomes unintelligible, then you could even be trapping mathematics 
in this fantasy world: you might be fixing the legitimate modes of inquiry in ways 
that would prevent mathematicians from ever discovering that what had been 
taken as given might actually be unreliable!⁶ Surely this would be an intolerable
situation.

Look, I know it seems awfully hypothetical—I mean, really, what are the 
chances that after we sort out the meaning of mathematical claims, we’ll find that 
formal logic doesn’t apply—but it’s at least a possibility, isn’t it? 


PROFESSOR CLASS Strictly speaking, I think Integrity’s point—that mathematics 
should precede logic—is well taken, for the transformations that preserve the 
truth of mathematical assertions could conceivably depend on the meaning we 
assign to these assertions. And strictly speaking, we do not need to formalize logic 
as a check on our reasoning as we go on from here. In individual cases, we can still 
think carefully and clearly about our assumptions and procedures to check whether 
our argument is correct. Common sense tells us that an argument so intricate that 
it cannot be checked informally, cannot be checked formally either.⁷ So let’s skip
our description of formal logic, for the moment. We can come back to it later. 

Why don’t we move on then to an informal description of set theory. All of 
mathematics rests ultimately on set theory, in the sense that every true statement 
in mathematics can be reduced in principle to a statement about sets that can itself 
be derived from the axioms of set theory. 


DENTON If sets are so basic, why not give us more than an “informal” description? 
This is a foundations course, after all. Give us the real stuff; we can take it.


PROFESSOR CLASS I appreciate your enthusiasm, Denton, but taking up the axioms
seriously would really take a big bite out of our term. As a compromise, though,
let’s write down some of the axioms⁸ (they’re called the Zermelo-Fraenkel Axioms)
and we can talk about them.


AXIOM SCHEMA OF COMPREHENSION For any property P(x) of x and any A, there 
is some B with x ∈ B if and only if x ∈ A and P(x) holds.


AXIOM OF PAIR Given any A and B, there is a C such that x ∈ C if and only if
x = A or x = B.


AXIOM OF INFINITY An inductive set exists.


AXIOM SCHEMA OF REPLACEMENT Suppose P(x, y) is a property such that for
every x there is a unique y that makes P(x, y) hold. Then for every A there is
some B such that for every x ∈ A there is some y ∈ B that makes P(x, y) hold.


INTEGRITY Shouldn’t we expect axioms to be self-evident? Or at least simpler than 
what we derive from them? 


PROFESSOR CLASS Well, these axioms become more familiar and plausible the 
more you work with them. This is even true when we write the axioms more 
rigorously. The replacement scheme axiom, for instance, could have been written 
this way: 


Given any formula φ with free variables among x, y, A, w_1, ..., w_n,
∀A ∀w_1, ..., w_n [∀x ∈ A ∃!y φ → ∃Y ∀x ∈ A ∃y ∈ Y φ]


INTEGRITY Hm. Presumably you must define the positive integers in terms of these 
axioms?


PROFESSOR CLASS Yes, of course. 


INTEGRITY But this is an obvious violation of Principle S! Surely we should not 
define something which is already clear, natural, and immediate, such as the 
positive integers and mathematical induction,⁹ in terms of something that is far
less self-evident, such as these Zermelo-Fraenkel axioms.¹⁰ Let’s put it to a class
vote. How many of you find the positive integers and mathematical induction clear, 
natural, and obviously correct? How many feel the same way about these axioms? 


PROFESSOR CLASS Obviously I don’t disagree. It’s plain that we have violated 
Principle S. But most mathematicians believe there are very good reasons for 
starting with the Zermelo-Fraenkel axioms rather than the positive integers. These 
axioms have given mathematics a solid foundation for many decades. Integrity, you 
have another comment? 


INTEGRITY Yes, I’ve thought of a second objection. The axioms of set theory, if 
taken to be true, must be regarded as meaningful, for otherwise we cross Principle 
M. But to the extent that the axioms have meaning, they appear to commit us to 
some sort of Platonic conception of mathematical existence. And certainly the 
assumption that mathematical objects enjoy this kind of metaphysical existence 
must be seen as an unwarranted assumption, a matter of faith rather than
evidence.¹¹ So we have a violation of Principle A as well.


PROFESSOR CLASS Of course most mathematicians do find some version of Platonism
congenial.¹²


INTEGRITY As do I. But should we adopt a philosophy because we find it sympa- 
thetic, because, in its congenial way, it tells us what we want to hear? Or should we 
look for a philosophy that provides secure support for the foundations of mathe- 
matics? How can we ever feel secure if we base mathematics on the unwarranted 
assumption, on our private belief, that mathematical assertions refer to some 
objective reality? Even if we could prove the axioms of set theory were consistent
(and I’ve heard that we can’t), we wouldn’t necessarily be able to construct
a model.¹³


PROFESSOR CLASS I’m beginning to agree with Integrity that taking the set theory
axioms seriously leads us into conflicts with not only Principle S but also either 
Principle A or Principle M. I’m also beginning to think that these three principles 
are not as spineless as we thought. However, I still feel that the principles reflect 
common sense, and that they should guide the construction of any field of rational 
and unbiased scientific inquiry. So let’s continue to keep track of our violations, as 
well as what these violations tell us about our approach to mathematical inquiry. 
For now why don’t we content ourselves with the following informal treatment 
of sets... . 


FRIDAY 


PROFESSOR CLASS I’m pleased to see everyone’s still with us, after the starts and 
stops we had on Wednesday. Today should be smoother. Normally in this course I 
first introduce the real numbers axiomatically and only later go through the actual 
construction of the reals. But I doubt that Integrity Jane would be able to suspend 
her disbelief long enough for me to finish the axiomatic approach; so I have 
decided to give the construction now. 

We start by defining each individual positive integer as follows: 


1 = {∅}, 2 = {∅, 1} = 1 ∪ {1}, 3 = {∅, 1, 2} = 2 ∪ {2},

and so on. (I can see you waving, Integrity, but let me continue for a minute.) To
define the set N of positive integers, we use the Axiom of Infinity to ensure the
existence of at least one set S satisfying the following two conditions,


(a) 1 ∈ S
(b) For every x, x ∈ S implies x ∪ {x} ∈ S,


and then let N be the intersection of all sets satisfying (a) and (b). It is then simple 
to see that the Peano Postulates hold for N, including of course the Principle of 
Induction.¹⁴
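The set-theoretic coding of the positive integers described here can be mirrored with nested sets in a programming language. The following is a minimal Python sketch, assuming frozenset as a stand-in for finite sets; the names successor and numeral are illustrative and do not come from the text.

```python
def successor(x):
    """Return x ∪ {x}, the set-theoretic successor of x."""
    return x | frozenset({x})

EMPTY = frozenset()              # the empty set ∅
ONE = frozenset({EMPTY})         # 1 = {∅}
TWO = successor(ONE)             # 2 = 1 ∪ {1} = {∅, 1}
THREE = successor(TWO)           # 3 = 2 ∪ {2} = {∅, 1, 2}

def numeral(n):
    """Set coding the positive integer n (n >= 1), built by repeated succession."""
    x = ONE
    for _ in range(n - 1):
        x = successor(x)
    return x

print(TWO == frozenset({EMPTY, ONE}))
print(THREE == frozenset({EMPTY, ONE, TWO}))
print([len(numeral(n)) for n in range(1, 6)])   # the set coding n has n elements
```

Only finitely many numerals are ever built this way; the set N itself, obtained as an intersection of inductive sets, has no counterpart in the sketch.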

Now Integrity has been waving her hand and shaking her head, because I guess 
we can all see violations of Principle S. 


INTEGRITY Yes. I think this development seems formal and pretty, yet somehow 
empty, as if the desire for empirical meaning had been lost.¹⁵ It’s a terrible
violation of Principle S, for, again, the positive integers and mathematical induc- 
tion strike us as far more immediate and clear than set theory based on the 
Zermelo-Fraenkel axioms. Why don’t we take the positive integers and their 
self-evident properties as given and build up mathematics from there? Can’t we 
do that?


PROFESSOR CLASS Perhaps we could, Integrity, but what I’m describing has been
found to be a precise and elegant way to define not just the positive integers, but 
also the real numbers. So let’s push on. At this point, the rational numbers can be 
defined easily and their field and order properties checked. Consult your text for 
the details. Now to define the real numbers, we set up an equivalence relation in 
the collection of all Cauchy sequences of rational numbers, 


(a_n) ≡ (b_n) if (a_n − b_n) converges to 0 in the rationals,


and then we call the resulting equivalence classes real numbers. 


INTEGRITY Can I ask why you put the Cauchy sequences into equivalence classes? 
Why not just say a real number is a Cauchy sequence of rationals and that two real 
numbers (a_n) and (b_n) are equal provided (a_n − b_n) converges to 0 in the
rationals? 


PROFESSOR CLASS Most mathematicians find an equality based on identity fits
their Platonic sympathies better than an equality based on a convention, as you
propose.¹⁶


STU On the other hand we don’t really lose anything, apart from some unnecessary 
abstraction, if we drop the equivalence classes, do we? After all, no one has a 
problem writing 1/3 = 2/6 to mean, not the identity of the fractions, but that an 
equivalence relation is satisfied. 
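The distinction between identity and equality by convention can be seen in a small computation. The Python sketch below treats a fraction as a pair of integers and a real number as a rule producing rational approximations; the two sequences for the square root of 2 are assumptions chosen for illustration, and a finite comparison like this one can only suggest, never establish, that their difference converges to 0.

```python
from fractions import Fraction

def same_fraction(p, q):
    """Equality of fractions as a convention: a/b = c/d when a*d == b*c."""
    (a, b), (c, d) = p, q
    return a * d == b * c

def newton_sqrt2(n):
    """n-th Newton iterate for x^2 = 2, starting from 2."""
    x = Fraction(2)
    for _ in range(n):
        x = (x + 2 / x) / 2
    return x

def bisect_sqrt2(n):
    """Midpoint of the bracketing interval after n bisection steps on [1, 2]."""
    lo, hi = Fraction(1), Fraction(2)
    for _ in range(n):
        mid = (lo + hi) / 2
        if mid * mid < 2:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# The pairs (1, 3) and (2, 6) are different objects that satisfy the relation.
print((1, 3) == (2, 6), same_fraction((1, 3), (2, 6)))

# Two different Cauchy sequences whose termwise difference shrinks toward 0.
for n in (2, 6, 10):
    print(n, float(abs(newton_sqrt2(n) - bisect_sqrt2(n))))
```

Nothing in the computation needs the equivalence class of either sequence; only the relation between two given sequences is ever used, which is Stu's point.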


PROFESSOR CLASS Your point is well taken, Stu and Integrity. But to continue, we 
can now introduce operations and an ordering and verify that our set of real 
numbers forms an ordered field. We’ll do some of this work during our next class, 
on Monday. At that time we will also prove that our construction has the following 
basic 


COMPLETENESS PROPERTY Every bounded, nonempty set S of real numbers has a 
least upper bound. 


INTEGRITY And by “has” you mean... 


PROFESSOR CLASS That some real number b exists that is a least upper bound 
for S. 


INTEGRITY I guess I’m just not clear on what meaning you are giving to “b exists”. 
STU It’s totally clear! It means that there is such a real number b.
DENTON In other words, the set of all least upper bounds is not empty. 


INTEGRITY But “has a least upper bound”, “a least upper bound exists”, “there is
a least upper bound”, “the set of all least upper bounds is not empty”: these are
all synonymous expressions. They don’t explain the meaning at all. Do you mean 
that you possess a method that specifies a b that works? 


PROFESSOR CLASS I guess that would depend on what you mean by “possess”,
“method”, and “specifies”.


INTEGRITY Well, suppose we consider for simplicity a less general completeness
property—that every bounded sequence of rationals has a least upper bound 
among the reals—and ask whether we could write a program that, given any such 
sequence, would compute rational approximations to the least upper bound, to 
within any desired tolerance. 


PROFESSOR CLASS Ahh... 


INTEGRITY I don’t believe that we can write such a program. I was thinking about 
this last night, and it seems that applied to any infinite sequence of 0s and 1s, this
program either would prove that every entry vanishes or would exhibit an entry 
equal to 1. Most of the well-known unresolved problems of mathematics—the 
Riemann Hypothesis and the Twin-Prime Conjecture among them—could be 
solved by such a powerful program. No program of this scope exists, and surely no 
one believes one will ever be written.¹⁷
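The reduction in this argument can be written out. The Python sketch below assumes a hypothetical routine lub_approx(seq, tol) that returns a rational number within tol of the least upper bound of a bounded sequence; no general routine of that kind is claimed to exist, which is the point of the argument. The stand-in bounded_horizon_lub inspects only finitely many entries, so it is correct only for sequences whose 1s, if any, occur before the chosen horizon.

```python
from fractions import Fraction

def decide_all_zero(seq, lub_approx):
    """Given a 0-1 sequence and a least-upper-bound oracle, return None if
    every entry is 0, or else the first index n with seq(n) == 1."""
    # The least upper bound of a 0-1 sequence is 0 or 1, so an approximation
    # to within 1/4 tells the two cases apart.
    approx = lub_approx(seq, Fraction(1, 4))
    if approx < Fraction(1, 2):
        return None
    # The bound is 1, so some entry equals 1 and this search terminates.
    n = 0
    while seq(n) != 1:
        n += 1
    return n

def bounded_horizon_lub(horizon):
    """Stand-in oracle (an assumption, NOT a real least-upper-bound program):
    it looks only at the first `horizon` entries of the sequence."""
    def lub_approx(seq, tol):
        return Fraction(max(seq(n) for n in range(horizon)))
    return lub_approx

oracle = bounded_horizon_lub(1000)
print(decide_all_zero(lambda n: 0, oracle))                    # every entry is 0
print(decide_all_zero(lambda n: 1 if n == 7 else 0, oracle))   # first 1 at index 7
```

With a true oracle in place of the stand-in, decide_all_zero would settle, for any 0-1 sequence, whether every entry vanishes, and that is the unrestricted power no one expects a program to have.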


PROFESSOR CLASS This is really very interesting, Integrity. If we could have written 
this program, we could have said, to give it a name, that the least upper bound 
exists constructively. But it doesn’t exist constructively. That’s your point? 


INTEGRITY Yes, but what I’m really worried about is what sort of meaning you can 
give to “b exists” when constructive existence has been ruled out. Is there anything 
other than some kind of metaphysical existence left?¹⁸


DENTON I’m confused. What’s the problem? The number b still exists; the set of 
all least upper bounds is still nonempty. Whether b exists constructively or not is 
only an interesting side question. 


INTEGRITY The question is, what do you mean when you claim that “b exists”.
We must be clear on meaning before we can decide truth. That’s Principle M. 


PROFESSOR CLASS I suppose we mean that it is false that every x in the reals R 
fails to be a least upper bound for S. 


INTEGRITY OK, but what then is the meaning of this new statement? It doesn’t
explain the meaning of an assertion A to say that A means that B is true, and B 
is true means that C is true, and so on. At some point, we have to stop and give 
the meaning of one of these statements on its own terms. Now whatever meaning 
the statement “It is false that every x in R fails to be a least upper bound for S” 
may have, that meaning must reside in the conditions, defined by the statement 
itself, that allow us to say it is true. But these conditions are plainly not conditions 
that we, in general, can recognize as being true when in fact they are true. We just 
do not have the capacity to check each x in R to see whether it fails or not. The 
truth conditions, where any meaning must be lurking, therefore lie beyond us, 
untestable, beyond our experience and our consciousness. So how could we ever be 
said to have acquired or formed any understanding of what it takes for such a 
statement to be true, that is, any understanding of the meaning of the statement? 


DENTON This is getting too heavy for me. 


STU Can we do some mathematics now, please?


INTEGRITY Worse, there is no way for us to manifest or communicate whatever
knowledge of the meaning of this statement we might claim to possess. And surely
it can’t be meaningful to claim that we have knowledge of something, even implicit
knowledge, if we cannot, in some circumstances at least, reveal that knowledge.¹⁹
Do you see what I mean? 


PROFESSOR CLASS I’m beginning to, yes. And so... 


INTEGRITY And so it appears that in general the statement “b exists” has no clear 
meaning, unless we take existence to be constructive. 


DENTON Professor? 


INTEGRITY I see only two ways out, and they’re both bad. On the one hand, in a 
brazen violation of Principle A, you could posit the existence of a being with 
infinite powers, a being who can actually perform the infinite, even uncountable, 
searches required to give (nonconstructive) assertions such as “a least upper bound 
b exists” some meaning, some sharable, factual content.²⁰ Of course, you buy this
meaning at a steep metaphysical price. 


DENTON Professor Class? 


INTEGRITY On the other hand, as a second fall-back position, you could claim that 
in fact the assertions of mathematics in general have no meaning, that in the end 
doing mathematics consists of manipulating meaningless strings of symbols. But 
then Principle M forces you to give up truth as well. This strikes me as falling 
down, rather than falling back, for this position makes our cherished mathematics, 
not an inquiry into “eternal truth”, but a meaningless, formal game. And I’m sure
you don’t see yourself as having taught generations of students a meaningless 
game. 


DENTON Professor Class, are you all right? 


INTEGRITY I hate to say it, but this whole development of analysis has a formal 
beauty that is hollow and meaningless at the core. It just lacks—I don’t know what 
to call it—perhaps integrity is the right word.²¹


DENTON Professor! 


PROFESSOR CLASS Yes, Denton, I’m fine. I was just... lost in thought, I guess, and 
feeling a little strange.²² Look, I know it’s not the end of the hour, but why don’t
we quit early today, and I’ll see you on Monday.


MONDAY 


PROFESSOR CLASS In light of the serious questions that have come up in our class, 
courtesy of Integrity’s initial proposal and her persistence in carrying it out, I felt 
driven over the weekend to think through the attitude I have always had (you 
could call it the classical attitude) toward mathematical existence and mathemati- 
cal truth. It seems to me that the foundations of classical analysis have fallen apart 
under the gentle prodding of our three innocent-looking principles of inquiry.
Either we must give up one or more of those common sense principles, or we must 
build up the foundations of analysis along different, perhaps more constructive, 
lines. Stu? 


STU You know, for a while I really enjoyed sitting on the sidelines, watching the 
philosophical dispute here in class. But this is a mathematics class. Let’s do some 
mathematics! A philosophical discussion concerning the nature of mathematical 
existence may be fun, and the position we take can certainly influence our 
understanding of mathematical assertions, but it can hardly have any relevance for 
the doing of mathematics. Proofs are still proofs; theorems are still theorems.²³


PROFESSOR CLASS If that were the case, Stu, I would have found Integrity’s 
questions less disturbing than I have. The truth is that your philosophical stance 
really does matter, and this has been known since Brouwer’s dissertation in 1907. 
The classical and constructive positions on mathematical existence lead to two 
different kinds of mathematics: different procedures are seen as legitimate, differ- 
ent proofs are seen as convincing, and different assertions are seen as theorems. 
Certain classical statements are not even intelligible from the constructive point of 
view! Some of you have read Kuhn’s The Structure of Scientific Revolutions? You 
probably thought that Kuhn’s ideas couldn’t apply to mathematics, but in fact I 
would say that the incommensurability in a shift from classical mathematics to 
constructive mathematics is as deep if not deeper than in any paradigm shift in 
physics or chemistry.²⁴ According to Kuhn, during a scientific revolution (and here
I’m quoting) “the scientist’s perception of his environment must be re-educated —in 
some familiar situations he must learn to see a new gestalt. After he has done 
so the world of his research will seem...incommensurable with the one he had 
inhabited before.”²⁵ Well, that pretty much describes what happened to me this 
weekend, except that Kuhn’s bloodless account doesn’t tell you how completely 
disorienting and yet thrilling the process can be. 

At the library on Saturday, I checked out the book Foundations of Constructive 
Analysis by Errett Bishop. Starting with the positive integers and their self-evident 
properties, he develops a natural and constructive version of mathematical analysis 
that appears to be consistent with our principles A, S, and M. It’s what Brouwer 
should have done, if he’d been serious about selling his intuitionist program to the 
classical mathematicians.²⁶ Though it’s out of print,²⁷ I received permission to copy 
the early chapters. Pass these copies around, please. This is your new text. Much of 
it will look familiar, but beware, it’s a starkly different world: truth and constructive 
proof are one (so there’s no such thing as an “unknowable truth”), mathematics 
precedes logic, and classical logic (in some cases) fails to preserve truth. In this 
world, a classically correct description of an integer—for example, that m is 0 if 
the Riemann Hypothesis is false and 1 if it’s true—can become so much empty, 
meaningless talk, for to Bishop every integer can be converted in principle to 
decimal form by a finite, purely routine, process.²⁸ And every mathematical assertion 
ultimately reduces to a report, that if we make certain (perhaps hypothetical) 
computations within the positive integers, then we shall get certain results.²⁹ From 
the constructive standpoint, an assertion is true only when we are in a position to 
assert it, and false or absurd when being in a position to assert it would give rise to 
a contradiction. We can no longer say that every mathematical assertion is true 
or false, because clearly, for many assertions A, we are in no position to assert A 
nor in any position to assert that A can never be asserted. And we can no longer 
rely on proofs by contradiction, because knowing that “It is absurd that A is 
absurd” does not imply that we can assert A, for it does not imply that we can 
necessarily effect the construction required to assert A.³⁰ 


Well, everybody ready? We’re starting the course over. We begin with the 
positive integers: 1, 2, 3, ... 




NOTES 


See [1, p. 320]. 

“Concerning the grounds for accepting logical laws... any ‘justification’ of such laws can be given 
only in terms of the adequacy of the language in which they are [embedded] to the specific tasks for 
which that language is employed... . Under the pressure of factual observation and norms of 
convenience familiar language habits may come to be revised; [so] the acceptance of logical 
principles as canonical need be neither on arbitrary grounds nor on grounds of their allegedly 
inherent authority, but on the ground that they elecuvely achieve certain postulated ends.” Ernest 
Nagel in [1, p. 320] 

“It seems to me that to clarify the sense of your [claim] you must again refer to metaphysical 
concepts: to some world of mathematical things existing independently of our knowledge... . But I 
repeat that mathematics ought not to depend on such notions as these. In fact all 
mathematicians ...are convinced that in some sense mathematics bears upon eternal truths, but 
when trying to define precisely this sense, one gets entangled in a maze of metaphysical difficulties. 
The only way to avoid them is to banish them from mathematics.” A. Heyting [11, p. 3] 

The contemporary mathematician’s use of language “seems to force us to choose between what are 
in fact two metaphorical descriptions of the manner in which pure mathematical knowledge is 
acquired: discovery or creation. And it strongly compels us to accept that the ‘correct’ answer is 
‘discovery’ and not ‘creation.’ ...one will then be drawn almost immediately into a completely 
Platonistic conception... . However...so long as we do not fall for the idea that talk ‘about 
statements’ and ‘about answers’ must be taken literally as being about ‘things’ that stand in a certain 
relationship to us... there is no choice to be made.” Gabriel Stolzenberg [17, p. 244] 

“We can, after all, ask: What does it mean for a set to exist if it can perhaps never be defined? It 
seems clear that this existence can only be a manner of speaking, which can only lead to purely 
formal propositions—perhaps made up of very beautiful words—about objects called sets. But most 
mathematicians want mathematics to deal, ultimately, with performable computing operations and 
not to consist of formal propositions about objects called this or that.” Thoralf Skolem in [10, p. 300] 
“Suppose that a... mathematical construction has been carefully described by means of words, and 
then, the introspective character of the mathematical construction being ignored for a moment, its 
linguistic description is considered by itself and submitted to a linguistic application of classical 
logic. Is it then always possible to perform a languageless mathematical construction finding its 
expression in the logico-linguistic figure in question? A careful examination reveals that ...with 
regard to the principle of the excluded third, except in special cases, the answer is in the negative.” 
L. E. J. Brouwer in [9, p. 236-7] 

[17, p. 225] 

[4, p. 5] 

Taken from [12] 

“[Mathematical induction], inaccessible to analytic proof and to experiment, is the exact type of the 
a priori synthetic intuition. ... Why then is this view imposed upon us with such irresistible weight of 
evidence? It is because it is only the affirmation of the power of the mind which knows it can 
conceive of the indefinite repetition of the same act, when the act is once possible. The mind has a 
direct intuition of this power... .” H. Poincaré in [1, p. 388] 

“Set-theoreticians are usually of the opinion that the notion of integer should be defined and that 
the principle of mathematical induction should be proved. But it is clear that we cannot define and 
prove ad infinitum; sooner or later we come to something that is not further definable or provable. 
Our only concern, then, should be that the initial foundations be something immediately clear, 
natural, and not open to question. This condition is satisfied by the notion of integer and by 
inductive inferences, but it is decidedly not satisfied by set-theoretic axioms of the type of Zermelo’s 
or anything else of that kind; if we were to accept the reduction of the former notions to the latter, 
the set-theoretic notions would have to be simpler than mathematical induction, and reasoning with 
them less open to question, but this runs entirely counter to the actual state of affairs.” Skolem in 
[10, p. 299] 

11 The axioms of set theory “if interpreted as meaningful statements, necessarily presuppose a kind of 
Platonism, which cannot satisfy any critical mind and which does not even produce the conviction 
that they are consistent.” K. Gödel, as quoted in [9, p. 99], in 1933. 

12 “It seems to me that no philosophy can possibly be sympathetic to a mathematician which does not 
admit, in one manner or another, the immutable and unconditional validity of mathematical truth. 
Mathematical theorems are true or false; their truth or falsity is absolute and independent of our 
knowledge of them. In some sense, mathematical truth is part of objective reality.” G. H. Hardy in 
[8, p. 1246]. 

13 “Suppose we have in some way proved, without thinking of any mathematical interpretation, that a 
logical system constructed from some linguistic axioms is non-contradictory... . If we then also find 
a mathematical interpretation of these axioms, does it follow...that such a mathematical system 
exists? But that has never been proved... .” Brouwer in [19, p. 266-7] 

14 Taken from [2, p. 79] 

15 “[A] feeling for reality ...ought to be preserved in even the most abstract studies.” Bertrand Russell 
[16, p. 169] 

16 [3, p. 13], [4, p. 12], [5, p. 15] 

17 [3, p. 4-5], [5, p. 7-8] 

18 “If ‘to exist’ does not mean ‘to be constructed,’ it must have some metaphysical meaning. It cannot 
be the task of mathematics to investigate this meaning or to decide whether it is tenable or not. We 
have no objection against a mathematician privately admitting any metaphysical theory he likes, but 
Brouwer’s [and more generally the constructive] program entails that we study mathematics as 
something simpler, more immediate than metaphysics, [as something where] ‘to exist’ must be 
synonymous with ‘to be constructed.’” Heyting [11, p. 2] 

19 [7, p. 225] 

20 “Classical mathematics concerns itself with operations that can be carried out by God... . You may 
think that I am making a joke... by bringing God into the discussion. This is not true. I am doing 
my best to develop a secure philosophical foundation, based on meaning rather than formalistics, 
for current mathematical practice. The most solid foundation available at present seems to me to 
involve the consideration of a being with non-finite powers—call him God or whatever you will—in 
addition to the powers possessed by finite beings.” Errett Bishop [4, p. 9] 

21 “When I attempt to express in positive terms that quality in which contemporary mathematics is 
deficient, ...I keep coming back to the term ‘integrity.’ Not the integrity of an isolated formalism 
that prides itself on the maintenance of its own standards of excellence, but an integrity that seeks 
common ground in the researches of pure mathematics, applied mathematics, and ... physics; that 
seeks to extract the maximum meaning from each new development; that is guided primarily by 
considerations of content rather than elegance and formal attractiveness; that sees to it that the 
mathematical representation of reality does not degenerate into a game... .” Bishop [4, p. 4] 

22 “To anyone who starts off inside the contemporary mathematician’s belief system, the discovery that 
an entire component of the ‘reality’ of one’s experience is produced by acts of acceptance as such in 
the domain of language use is not merely illuminating. In a literal sense, it is shattering: Once a 
mathematician has seen that his perception of the ‘self-evident correctness’ of the law of excluded 
middle is nothing more than the linguistic equivalent of an optical illusion, neither his practice of 
mathematics nor his understanding of it can ever be the same.” Stolzenberg [17, p. 268] 

23 “All philosophical differences...ought not to affect the detail of mathematics, but only the 
interpretation. Mathematics would be in a bad way if it could not proceed until [the philosophical 
disputes] had been settled.” Russell in 1906, the year before Brouwer’s dissertation provided 
evidence that a constructive position on mathematical existence changes the face of mathematics, as 
quoted in [14, p. 131-132] 

24 Read [15] and [17] 

25 [13, p. 112] 

26 See [15] 

27 Bishop’s book has been born again in a somewhat expanded and altered form in [5] 

28 [4, p. 8] 

29 [3, p. 3] or [5, p. 5] 

30 For much more detail on the consequences of the constructive standpoint, read [3], [4], [5], [6], 
or [11]. 
Multivariable Calculus and the Plus Topology 


Daniel J. Velleman 


Among the most subtle concepts in multivariable calculus are the concepts of 
continuity and differentiability of functions of two (or more) variables. These 
concepts are designed to tell us about the local behavior of a function near a point. 
Since “local” is defined by reference to the standard topology on $\mathbb{R}^2$, the 
definitions of continuity and differentiability must take into account the fact that a 
neighborhood of a point in this topology includes nearby points in all directions, 
not just the coordinate directions. As a result, these definitions involve limits in 
which a point (x, y) approaches a point (a,b), and such limits cannot be under- 
stood in terms of limits in which the variables x and y approach the limits a and b 
separately. This explains why, for example, differentiability of a function of two 
variables is not the same as existence of the two first partial derivatives. 

But now suppose we are interested in studying the partial derivatives of a 
function. Since the partial derivatives are defined in terms of limits with respect to 
the independent variables separately, they cannot be thought of as giving us 
information about the local behavior of the function near a point—at least, not if 
“local” is defined by reference to the standard topology. But what if we use a 
different topology? Is there some topology on $\mathbb{R}^2$ that is appropriate for the study 
of partial derivatives, in the same way that the standard topology is appropriate for 
the study of continuity and differentiability? My purpose in this paper is to show 
that there is such a topology, and that the study of this topology can shed light on 
some of the subtleties of multivariable calculus. 

The standard topology on $\mathbb{R}^2$ is defined by reference to $\varepsilon$-balls, where for any 
$\varepsilon > 0$ and any point $(a, b) \in \mathbb{R}^2$, the $\varepsilon$-ball centered at $(a, b)$ is defined to be the 
set 

$$B_\varepsilon(a, b) = \{(x, y) \in \mathbb{R}^2 \mid \sqrt{(x - a)^2 + (y - b)^2} < \varepsilon\}.$$

We define the $\varepsilon$-plus centered at $(a, b)$ to be the set 

$$+_\varepsilon(a, b) = \{(x, b) \in \mathbb{R}^2 \mid |x - a| < \varepsilon\} \cup \{(a, y) \in \mathbb{R}^2 \mid |y - b| < \varepsilon\}.$$

Of course, the reason for the name is that the set $+_\varepsilon(a, b)$ looks like a plus sign 
centered at $(a, b)$, with “radius” $\varepsilon$; see Figure 1. We say that a set $U \subseteq \mathbb{R}^2$ is 
plus-open if for every $(a, b) \in U$ there is some $\varepsilon > 0$ such that $+_\varepsilon(a, b) \subseteq U$. It is 
easy to verify that the plus-open sets form a topology on $\mathbb{R}^2$, which we will call the 
plus topology. Clearly every open set is plus-open, but there are plus-open sets that 
are not open. For example, the set 

$$A = \{(0, 0)\} \cup \{(x, y) \in \mathbb{R}^2 \mid |y| > 3|x|\} \cup \{(x, y) \in \mathbb{R}^2 \mid |y| < |x|/3\}$$

is plus-open, but it is not open because it contains no $\varepsilon$-ball centered at $(0, 0)$; see 
Figure 2. Thus, the plus topology is strictly finer than the standard topology. 
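The claim about the set A can be spot-checked numerically. The following is a minimal sketch, not part of the original article: the helper names `in_A`, `plus_points`, and `diagonal_point` are hypothetical, and sampling finitely many points only illustrates the statement rather than proving it.

```python
import math

def in_A(x, y):
    """Membership test for A = {(0,0)} U {|y| > 3|x|} U {|y| < |x|/3}."""
    if x == 0 and y == 0:
        return True
    return abs(y) > 3 * abs(x) or abs(y) < abs(x) / 3

def plus_points(a, b, eps, n=200):
    """Sample points of the eps-plus centered at (a, b) (both arms)."""
    # Offsets lie strictly inside (-eps, eps).
    offsets = [eps * (2 * k / n - 1) for k in range(1, n)]
    horizontal = [(a + t, b) for t in offsets]
    vertical = [(a, b + t) for t in offsets]
    return horizontal + vertical

def diagonal_point(eps):
    """A point on the line y = x inside the eps-ball centered at the origin."""
    r = eps / 2
    return (r / math.sqrt(2), r / math.sqrt(2))

if __name__ == "__main__":
    for eps in (1.0, 1e-3, 1e-6):
        whole_plus_in_A = all(in_A(x, y) for x, y in plus_points(0.0, 0.0, eps))
        ball_point_in_A = in_A(*diagonal_point(eps))
        print(eps, whole_plus_in_A, ball_point_in_A)
```

Every sampled point of the plus at the origin lies in A, while each ball around the origin contains a diagonal point outside A, which is the reason A is plus-open but not open.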

As evidence that the plus topology is the right topology for studying concepts 
involving limits with respect to the independent variables separately, we offer the 
following theorem. The theorem concerns separately continuous functions, where 
a function $f$ with domain $\mathbb{R}^2$ is called separately continuous if for every $b \in \mathbb{R}$, the 
function $f(x, b)$ is a continuous function of $x$, and for every $a \in \mathbb{R}$, the function 
$f(a, y)$ is a continuous function of $y$. 

[Figure 1. The sets $B_\varepsilon(a, b)$ and $+_\varepsilon(a, b)$. Figure 2. The set $A$.] 


Theorem 1. For every topological space $Y$ and every function $f\colon \mathbb{R}^2 \to Y$, $f$ is 
separately continuous if and only if it is continuous with respect to the plus topology on 
$\mathbb{R}^2$. Furthermore, the plus topology is the only topology for which this is true. 


Proof: Suppose that $f$ is separately continuous, and let $V \subseteq Y$ be open. Suppose 
$(a, b) \in f^{-1}(V)$. Then $f(a, b) \in V$, so since the function $f(x, b)$ is continuous, 
there is some $\varepsilon_1 > 0$ such that if $|x - a| < \varepsilon_1$ then $f(x, b) \in V$. Similarly, there is 
some $\varepsilon_2 > 0$ such that if $|y - b| < \varepsilon_2$ then $f(a, y) \in V$. Clearly $+_\varepsilon(a, b) \subseteq f^{-1}(V)$, 
where $\varepsilon = \min(\varepsilon_1, \varepsilon_2)$. Thus $f^{-1}(V)$ is plus-open, so $f$ is continuous with respect 
to the plus topology. 

Now suppose that $f$ is continuous with respect to the plus topology. 
Suppose that $(a, b) \in \mathbb{R}^2$, and let $V$ be any neighborhood of $f(a, b)$ in $Y$. Then 
$(a, b) \in f^{-1}(V)$ and $f^{-1}(V)$ is plus-open, so there is some $\varepsilon > 0$ such that 
$+_\varepsilon(a, b) \subseteq f^{-1}(V)$. It follows that if $|x - a| < \varepsilon$ then $f(x, b) \in V$, and if 
$|y - b| < \varepsilon$ then $f(a, y) \in V$. Since $V$ was arbitrary, this shows that the function 
$f(x, b)$ is continuous at $x = a$ and the function $f(a, y)$ is continuous at $y = b$. 
Thus, $f$ is separately continuous. 

Finally, to prove uniqueness, suppose that $T$ is another topology on $\mathbb{R}^2$ with the 
property stated in the theorem. Let $Y$ be $\mathbb{R}^2$ with the plus topology, and let 
$f\colon \mathbb{R}^2 \to Y$ be the identity function. Then $f$ is clearly continuous with respect to the 
plus topology on the domain, so by the part of the theorem already proved, $f$ must 
be separately continuous. Thus, $f$ is continuous with respect to the topology $T$ on 
the domain. In other words, for every plus-open set $U$, $U = f^{-1}(U) \in T$, so $T$ is at 
least as fine as the plus topology. Similar reasoning, with the roles of $T$ and the 
plus topology reversed, shows that the plus topology is at least as fine as $T$, so $T$ 
must be the plus topology. ∎ 
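To see the gap between separate continuity and ordinary continuity concretely, here is a small sketch using a standard textbook function that does not appear in the article: s(x, y) = xy/(x² + y²) with s(0, 0) = 0. The function name `s` and the helper `max_deviation_on_plus` are assumptions made for illustration.

```python
def s(x, y):
    """Standard example: xy / (x^2 + y^2) away from the origin, 0 at the origin."""
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)

def max_deviation_on_plus(fn, a, b, eps, n=400):
    """Largest |fn(p) - fn(a, b)| over sampled points p of the eps-plus at (a, b)."""
    center = fn(a, b)
    offsets = [eps * (2 * k / n - 1) for k in range(1, n)]
    arms = [(a + t, b) for t in offsets] + [(a, b + t) for t in offsets]
    return max(abs(fn(x, y) - center) for x, y in arms)

if __name__ == "__main__":
    # On every plus at the origin s is identically 0, so s is continuous
    # there with respect to the plus topology.
    print(max_deviation_on_plus(s, 0.0, 0.0, 1e-3))
    # Along the diagonal s stays at 1/2, so s is not continuous at the
    # origin in the standard topology.
    print([s(t, t) for t in (1e-1, 1e-4, 1e-8)])
```

In the language of Theorem 1, the preimage under s of a small interval around 0 contains a plus centered at the origin but no ball centered there.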


The plus topology is actually a special case of a kind of product topology that 
has appeared occasionally in the topology literature; see [2] and [3]. There are also 
related topologies on $\mathbb{R}^2$ that can be used to study continuity and directional 
derivatives in directions other than the directions of the coordinate axes. However, 
in this paper we restrict our attention to the plus topology on $\mathbb{R}^2$. 

Corresponding to the fact that all differentiable functions are continuous, we 
have the following corollary of Theorem 1: 


Corollary 2. Suppose $f\colon \mathbb{R}^2 \to \mathbb{R}$. If the partial derivatives $f_x$ and $f_y$ are defined 
everywhere, then $f$ is continuous with respect to the plus topology on the domain $\mathbb{R}^2$. 


Proof: If $f_x$ and $f_y$ are defined everywhere then $f$ must be separately continuous, 
so the conclusion follows from Theorem 1. ∎ 


Since partial derivatives are defined using limits with respect to the independent 
variables separately, the first partial derivatives of a function $f$ at a point $(a, b)$ can 
be computed from the values of $f$ at all points in any $\varepsilon$-plus centered at $(a, b)$. 
Applying this fact at every point in a plus-open set proves our next theorem. 


Theorem 3. Suppose that $f, g\colon \mathbb{R}^2 \to \mathbb{R}$, $U$ is a plus-open set, and for all $(a, b) \in U$, 
$f(a, b) = g(a, b)$. Then for all $(a, b) \in U$, $f_x(a, b) = g_x(a, b)$ and $f_y(a, b) = g_y(a, b)$, 
where each equation should be interpreted as meaning that either both partial derivatives 
are undefined, or both are defined and they are equal. 


Mixed higher order partial derivatives of a function $f$ at a point $(a, b)$ cannot be 
computed from the values of $f$ on an $\varepsilon$-plus centered at $(a, b)$. However, applying 
Theorem 3 repeatedly leads to the following corollary: 


Corollary 4. Suppose that $f, g\colon \mathbb{R}^2 \to \mathbb{R}$, $U$ is a plus-open set, and for all $(a, b) \in U$, 
$f(a, b) = g(a, b)$. Then all partial derivatives (including all mixed partials) of $f$ and $g$ 
agree at all points in $U$. 


For example, consider the following two functions: 

$$f(x, y) = \begin{cases} x^2 + y^2 & \text{if } (x, y) \in A, \\ -1 & \text{if } (x, y) \notin A, \end{cases} \qquad g(x, y) = x^2 + y^2, \tag{1}$$

where $A$ is the plus-open set in Figure 2; see Figures 3 and 4. These functions 
agree at all points in $A$, so by Corollary 4 their partial derivatives of all orders also 
agree at all points in $A$. In particular, all partial derivatives of $f$ and $g$ agree at 
$(0, 0)$. We might say that the partial derivatives at $(0, 0)$ look at points only in a 
plus-open neighborhood of $(0, 0)$, and therefore they don’t see the difference 
between $f$ and $g$. But the local (in the sense of the standard topology) behavior of 
these functions is quite different near $(0, 0)$. For example, $g$ is differentiable at 
$(0, 0)$, and $f$ is not even continuous there. This illustrates the point that partial 
derivatives of a function do not give information about its local behavior. 

This example also makes it clear that it is impossible to tell whether or not a 
function is differentiable at a particular point by examining its partial derivatives 
(of any order) at that point. The test for differentiability given in most multivariable 
calculus books says that a function is differentiable at a point if the first 
partial derivatives are not only defined but also continuous at that point. In fact, 
examination of the proof shows that it suffices to assume that only one of the 
partial derivatives is continuous, but this example shows why one cannot drop the 
continuity requirement completely. 

[Figure 3. $z = f(x, y)$. Figure 4. $z = g(x, y)$.] 
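The behavior of the functions f and g in (1) at the origin can be illustrated with a short numerical sketch. The code below is not from the article; the helper names `in_A`, `partial_x`, and `partial_y` are hypothetical, and the difference quotients only approximate the first partial derivatives.

```python
def in_A(x, y):
    """Membership test for the plus-open set A of Figure 2."""
    if x == 0 and y == 0:
        return True
    return abs(y) > 3 * abs(x) or abs(y) < abs(x) / 3

def g(x, y):
    return x * x + y * y

def f(x, y):
    """Agrees with g on A and equals -1 off A, as in equation (1)."""
    return x * x + y * y if in_A(x, y) else -1.0

def partial_x(fn, a, b, h):
    """Difference quotient in the x direction (uses only points on the plus)."""
    return (fn(a + h, b) - fn(a, b)) / h

def partial_y(fn, a, b, h):
    """Difference quotient in the y direction (uses only points on the plus)."""
    return (fn(a, b + h) - fn(a, b)) / h

if __name__ == "__main__":
    for h in (1e-1, 1e-3, 1e-6):
        print(h, partial_x(f, 0, 0, h), partial_x(g, 0, 0, h),
              partial_y(f, 0, 0, h), partial_y(g, 0, 0, h))
    # Along the diagonal f is stuck at -1 while g tends to 0.
    print(f(1e-6, 1e-6), g(1e-6, 1e-6))
```

The difference quotients of f and g coincide for every step size because they sample only the axes, which lie in A. The values along the diagonal show that f is nevertheless discontinuous at the origin.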


Here is another well-known theorem from multivariable calculus; see [5, p. 212]: 


Theorem 5. (Second Derivative Test for Local Extrema) Suppose that $f(x, y)$ is 
differentiable in a neighborhood of $(a, b)$, $f_x(a, b) = f_y(a, b) = 0$, and $f_x$ and $f_y$ are 
differentiable at $(a, b)$. Let $D = f_{xx}(a, b) f_{yy}(a, b) - [f_{xy}(a, b)]^2$. Then: 


1. If $D > 0$ and $f_{xx}(a, b) > 0$ then $f$ has a local minimum at $(a, b)$. 
2. If $D > 0$ and $f_{xx}(a, b) < 0$ then $f$ has a local maximum at $(a, b)$. 
3. If $D < 0$ then $f$ does not have a local extremum at $(a, b)$. 


Once again, the plus topology can be helpful in constructing and understanding 
examples that illustrate why the hypotheses are needed. It is easy to check that the 
Second Derivative Test correctly determines that the function g in (1) has a local 
minimum at (0, 0). Since the partial derivatives of f and g in (1) agree at (0, 0), the 
test gives the same answer for f, even though f does not have a local minimum at 
(0, 0). Of course, f does not satisfy the first hypothesis of Theorem 5, since it is not 
differentiable in a neighborhood of (0, 0). But it is not hard to modify f to make it 
differentiable everywhere, and still have the Second Derivative Test fail. We 
simply need a surface that is the same as the graph of g on a plus-open 
neighborhood of (0,0), but is concave downward outside of that neighborhood. A 
natural choice would be a surface given in polar coordinates by an equation of the 
form $z = c(\theta) r^2$, where $c(\theta)$ is 1 when $\theta$ is close to an integer multiple of $\pi/2$ 
and $c(\theta)$ changes smoothly to a negative value when $\theta$ is an odd multiple of $\pi/4$. 
For example, we might let $c$ be a function that is periodic with period $\pi/2$ and 
define $c(\theta)$ for $\theta$ between 0 and $\pi/2$ as follows: 

$$c(\theta) = \begin{cases} 1 & \text{if } 0 \le \theta \le \pi/8, \\ \text{[expression illegible in the source]} & \text{if } \pi/8 < \theta < 3\pi/8, \\ 1 & \text{if } 3\pi/8 \le \theta \le \pi/2. \end{cases}$$


The graph of $c$ is shown in Figure 5, and the surface $z = c(\theta) r^2$ is shown in 
Figure 6. This surface is the graph of a function $h(x, y)$ that is infinitely differentiable 
at all points other than the origin, since $c(\theta)$ is infinitely differentiable, and 
$\theta$ and $r$ are infinitely differentiable functions of $x$ and $y$ in a neighborhood of any 
point except the origin. Furthermore, since $c(\theta)$ is bounded, there are constants $a$ 
and $b$ such that $a(x^2 + y^2) \le h(x, y) \le b(x^2 + y^2)$, from which it follows that $h$ is 
differentiable at $(0, 0)$. And finally, since $h$ agrees with $g$ on a plus-open neighborhood 
of $(0, 0)$, all partial derivatives of $h$ are defined at $(0, 0)$ and are the same as 
the partial derivatives of $g$. Therefore the Second Derivative Test incorrectly 
indicates a local minimum for $h$ at $(0, 0)$. The only hypothesis of Theorem 5 that 
we have not checked is the differentiability of the partial derivatives at $(0, 0)$, so 
this hypothesis must fail for $h$, and it cannot be dropped from the theorem. The 
reader might enjoy checking that $h_x(x, y) = 2x c(\theta) - y c'(\theta)$ and $h_y(x, y) = 
2y c(\theta) + x c'(\theta)$. Using the fact that $c(\theta)$ and $c'(\theta)$ are bounded but not constant, 
it can be shown that the first partial derivatives are continuous but not differentiable 
at $(0, 0)$. One can get an example where the Second Derivative Test 
incorrectly indicates that a function does not have a local extremum by adding 
$z = (1 - c(\theta)) r^2$ to an appropriately chosen surface with a saddle at $(0, 0)$, such as 
$z = x^2 + y^2 + (2 + \varepsilon)xy$, for sufficiently small positive $\varepsilon$. Similar examples can be 
found in [4]. 

[Figure 5. $y = c(\theta)$. Figure 6. $z = c(\theta) r^2$.] 
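The formulas for the first partial derivatives of $h$ can be checked with the chain rule. The following derivation is a sketch added for the reader's convenience. It assumes only that $h = c(\theta) r^2$ with $r^2 = x^2 + y^2$ and $\theta$ the polar angle, so that $\partial\theta/\partial x = -y/r^2$ and $\partial\theta/\partial y = x/r^2$ away from the origin.

```latex
\begin{align*}
h_x(x, y) &= c'(\theta)\,\frac{\partial \theta}{\partial x}\, r^2
             + c(\theta)\,\frac{\partial (r^2)}{\partial x}
           = c'(\theta)\left(-\frac{y}{r^2}\right) r^2 + 2x\,c(\theta)
           = 2x\,c(\theta) - y\,c'(\theta), \\
h_y(x, y) &= c'(\theta)\,\frac{\partial \theta}{\partial y}\, r^2
             + c(\theta)\,\frac{\partial (r^2)}{\partial y}
           = c'(\theta)\left(\frac{x}{r^2}\right) r^2 + 2y\,c(\theta)
           = 2y\,c(\theta) + x\,c'(\theta).
\end{align*}
```

Both expressions are bounded by a constant multiple of $r$, since $c$ and $c'$ are bounded, which is consistent with the statement that the first partial derivatives are continuous at $(0, 0)$.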

All of our examples so far have been based on the plus-open set $A$, but there 
are many more exotic plus-open sets. For example, let $\{B_1, B_2, B_3, \ldots\}$ be a 
countable basis for the standard topology on $\mathbb{R}^2$. Inductively choose, for each 
positive integer $n$, a point $(x_n, y_n) \in B_n$ such that for all $m < n$, $x_n \ne x_m$ and 
$y_n \ne y_m$. Let $F = \{(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots\}$. Since $F$ contains a point from 
every basic open set, $\mathbb{R}^2 \setminus F$ has empty interior in the standard topology. However, 
we claim that $\mathbb{R}^2 \setminus F$ is plus-open. To see why, suppose $(a, b) \in \mathbb{R}^2 \setminus F$. Then 
since there is at most one point in $F$ with $y$-coordinate $b$, it is easy to find an 
$\varepsilon_1 > 0$ such that if $|x - a| < \varepsilon_1$ then $(x, b) \notin F$. Similarly, we can find an $\varepsilon_2 > 0$ 
such that if $|y - b| < \varepsilon_2$ then $(a, y) \notin F$. Thus $+_\varepsilon(a, b) \subseteq \mathbb{R}^2 \setminus F$, where 
$\varepsilon = \min(\varepsilon_1, \varepsilon_2)$. 
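The inductive choice of the points of F can be sketched in code. This is only an illustration under stated assumptions: the article does not fix a particular basis, so the sketch uses open squares with dyadic rational centers and half-widths 1/2^k (one possible countable basis), and the names `boxes`, `fresh`, and `first_points` are hypothetical.

```python
from fractions import Fraction
from itertools import count, islice

def boxes():
    """Enumerate open squares (center_x, center_y, half_width) with dyadic centers.

    Assumption: these squares serve as the countable basis B_1, B_2, ...
    """
    for k in count(0):
        half = Fraction(1, 2 ** k)
        for i in range(-k * 2 ** k, k * 2 ** k + 1):
            for j in range(-k * 2 ** k, k * 2 ** k + 1):
                yield (Fraction(i, 2 ** k), Fraction(j, 2 ** k), half)

def fresh(center, half, used):
    """Return a coordinate strictly inside (center - half, center + half) not in `used`."""
    for m in count(2):
        candidate = center + half / m
        if candidate not in used:
            return candidate

def first_points(n):
    """Choose one point in each of the first n squares, with all x's and all y's distinct."""
    used_x, used_y, chosen = set(), set(), []
    for cx, cy, half in islice(boxes(), n):
        x = fresh(cx, half, used_x)
        y = fresh(cy, half, used_y)
        used_x.add(x)
        used_y.add(y)
        chosen.append(((cx, cy, half), (x, y)))
    return chosen

if __name__ == "__main__":
    for box, point in first_points(5):
        print(box, point)
```

Because no two chosen points share an x-coordinate or a y-coordinate, every horizontal or vertical line meets the chosen set in at most one point, which is the property used to show that the complement of F is plus-open.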

Unusual plus-open sets can lead to unusual examples in multivariable calculus. 
For example, define $j\colon \mathbb{R}^2 \to \mathbb{R}$ as follows: 

$$j(x, y) = \begin{cases} 1 & \text{if } (x, y) \in F, \\ 0 & \text{if } (x, y) \notin F. \end{cases} \tag{2}$$

Then $j$ agrees with the constant function $c(x, y) = 0$ on the plus-open set $\mathbb{R}^2 \setminus F$, 
and therefore by Corollary 4 all partial derivatives of $j$ are defined and equal to 0 
everywhere except for the countably many points in $F$. But since both $F$ and 
$\mathbb{R}^2 \setminus F$ are dense in the plane in the standard topology, $j$ is discontinuous 
everywhere. 

Can the countable set of exceptional points in this example be avoided? Can a 
function have partial derivatives defined everywhere but be discontinuous everywhere? 
The answer is no, but to see why we need a fact about closures in the plus 
topology. For a set $X \subseteq \mathbb{R}^2$, we write $\mathrm{cl}(X)$ for the closure of $X$ in the standard 
topology, and $\mathrm{cl}_+(X)$ for the closure of $X$ in the plus topology. Note that 
$\mathrm{cl}_+(X) \subseteq \mathrm{cl}(X)$, since the plus topology is finer than the standard topology. For 
$X \subseteq \mathbb{R}$ we also write $\mathrm{cl}(X)$ for the closure of $X$ in the standard topology on $\mathbb{R}$. 

Our example $\mathbb{R}^2 \setminus F$ shows that a nonempty plus-open set can have empty 
interior in the standard topology. However, this cannot be true of the closure in 
the plus topology of a nonempty plus-open set. In fact, we have the following 
slightly stronger theorem, which implies that $\mathbb{R}^2$ with the plus topology is a Baire 
space: 


Theorem 6. Suppose $U$ is a nonempty plus-open set, and $U = \bigcup_{n \in \mathbb{Z}^+} U_n$. Then for 
some $n$, $\mathrm{cl}_+(U_n)$ has nonempty interior in the standard topology. 


Proof: Let $(a, b) \in U$, and choose $\varepsilon > 0$ such that $+_\varepsilon(a, b) \subseteq U$. For each 
$x \in (a - \varepsilon, a + \varepsilon)$ and $n \in \mathbb{Z}^+$, let $Y_n^x = \{y \mid (x, y) \in U_n\}$, and let $Y^x = 
\bigcup_{n \in \mathbb{Z}^+} Y_n^x = \{y \mid (x, y) \in U\}$. Since $(x, b) \in +_\varepsilon(a, b) \subseteq U$ and $U$ is plus-open, $Y^x$ 
must contain an interval. Thus, by the Baire Category Theorem, there is some 
positive integer $n_x$ such that $\mathrm{cl}(Y_{n_x}^x)$ contains an interval. Choose rational numbers 
$p_x$ and $q_x$ such that $p_x < q_x$ and $(p_x, q_x) \subseteq \mathrm{cl}(Y_{n_x}^x)$. 

For each positive integer $n$ and rational interval $(p, q)$, let $X_{n,p,q} = 
\{x \in (a - \varepsilon, a + \varepsilon) \mid n_x = n, p_x = p, \text{ and } q_x = q\}$. Since there are only countably 
many possible values for $n$, $p$, and $q$, another application of the Baire Category 
Theorem shows that there must be some $n$, $p$, and $q$ such that $\mathrm{cl}(X_{n,p,q})$ contains 
an interval. Choose $c < d$ such that $(c, d) \subseteq \mathrm{cl}(X_{n,p,q})$. For each $x \in X_{n,p,q}$, 
$(p, q) \subseteq \mathrm{cl}(Y_n^x)$, and it is not hard to see that therefore $\{x\} \times (p, q) \subseteq \mathrm{cl}_+(U_n)$. 
Similarly, since $(c, d) \subseteq \mathrm{cl}(X_{n,p,q})$, it follows that $(c, d) \times (p, q) \subseteq \mathrm{cl}_+(U_n)$, as 
required. ∎ 


Using Theorem 6, we can prove the following theorem of Baire; see [1] and [6]: 


Theorem 7. (Baire) Suppose $f\colon \mathbb{R}^2 \to \mathbb{R}$ and suppose $f_x$ and $f_y$ are defined at all 
points in $\mathbb{R}^2$. Then there is a dense set of points at which $f$ is differentiable. 


Proof: For $h \ne 0$ define functions $m_h$ and $n_h$ as follows: 

$$m_h(x, y) = \frac{f(x + h, y) - f(x, y)}{h}, \qquad n_h(x, y) = \frac{f(x, y + h) - f(x, y)}{h}.$$

Note that $m_h$ and $n_h$ are separately continuous, since $f$ is. Of course 
$\lim_{h \to 0} m_h(x, y) = f_x(x, y)$ and $\lim_{h \to 0} n_h(x, y) = f_y(x, y)$. 

We claim first that if $V$ is any nonempty open set and $\varepsilon > 0$ then there 
is a nonempty open set $W$ such that $\mathrm{cl}(W) \subseteq V$ and for all $(u, v), (x, y) \in W$, 
$|f_x(u, v) - f_x(x, y)| < \varepsilon$ and $|f_y(u, v) - f_y(x, y)| < \varepsilon$. To prove the claim, first 
choose a nonempty open set $X$ such that $\mathrm{cl}(X) \subseteq V$. Now for each positive integer 
$n$ and rational numbers $p$ and $q$, let 

$$U_{n,p,q} = \{(x, y) \in X \mid \text{for all } h, \text{ if } 0 < |h| < 1/n \text{ then } |m_h(x, y) - p| < \varepsilon/3 \text{ and } |n_h(x, y) - q| < \varepsilon/3\}.$$

Clearly $\bigcup\{U_{n,p,q} \mid n \in \mathbb{Z}^+ \text{ and } p, q \in \mathbb{Q}\} = X$, so by Theorem 6 we can choose 
$n \in \mathbb{Z}^+$ and $p, q \in \mathbb{Q}$ such that $\mathrm{cl}_+(U_{n,p,q})$ has nonempty interior in the standard 
topology. Let $W$ be the interior of $\mathrm{cl}_+(U_{n,p,q})$. Then $\mathrm{cl}(W) \subseteq \mathrm{cl}(X) \subseteq V$, and 
using the fact that $m_h$ and $n_h$ are separately continuous, it is not hard to see that 

$$W \subseteq \mathrm{cl}_+(U_{n,p,q}) \subseteq \{(x, y) \in \mathbb{R}^2 \mid \text{for all } h, \text{ if } 0 < |h| < 1/n \text{ then } |m_h(x, y) - p| \le \varepsilon/3 \text{ and } |n_h(x, y) - q| \le \varepsilon/3\}.$$

It follows that for all $(x, y) \in W$, $|f_x(x, y) - p| \le \varepsilon/3$ and $|f_y(x, y) - q| \le \varepsilon/3$, 
and therefore for all $(u, v), (x, y) \in W$, $|f_x(u, v) - f_x(x, y)| \le 2\varepsilon/3 < \varepsilon$ and 
$|f_y(u, v) - f_y(x, y)| \le 2\varepsilon/3 < \varepsilon$, as required. 

Now let $V_0$ be any nonempty bounded open set. To prove the theorem, we must 
find a point in $V_0$ at which $f$ is differentiable. By the claim, let $V_1$ be a nonempty 
open set such that $\mathrm{cl}(V_1) \subseteq V_0$ and for all $(u, v), (x, y) \in V_1$, $|f_x(u, v) - f_x(x, y)| < 1$ 
and $|f_y(u, v) - f_y(x, y)| < 1$. Applying the claim again, let $V_2$ be a nonempty open 
set such that $\mathrm{cl}(V_2) \subseteq V_1$ and for all $(u, v), (x, y) \in V_2$, $|f_x(u, v) - f_x(x, y)| < 1/2$ 
and $|f_y(u, v) - f_y(x, y)| < 1/2$. In general, given $V_n$, we choose a nonempty open 
set $V_{n+1}$ such that $\mathrm{cl}(V_{n+1}) \subseteq V_n$ and for all $(u, v), (x, y) \in V_{n+1}$, 
$|f_x(u, v) - f_x(x, y)| < 1/(n + 1)$ and $|f_y(u, v) - f_y(x, y)| < 1/(n + 1)$. 

Let $(a, b) \in \bigcap_{n \in \mathbb{Z}^+} V_n$. (The intersection is nonempty because the sets $\mathrm{cl}(V_n)$ are 
nested, nonempty, and compact, and $\mathrm{cl}(V_{n+1}) \subseteq V_n$.) Then for every positive integer $n$, $(a, b) \in V_n$, and for 
every $(x, y) \in V_n$, $|f_x(a, b) - f_x(x, y)| < 1/n$ and $|f_y(a, b) - f_y(x, y)| < 1/n$. It 
follows that $f_x$ and $f_y$ are continuous at $(a, b)$, and therefore $f$ is differentiable at 
$(a, b)$, as required. ∎ 


Returning to our function j in (2), we can now see why the exceptional points 
cannot be avoided. If the partial derivatives of a function are defined everywhere 
then, by Theorem 7, it must be not only continuous but also differentiable at a 
dense set of points. Our function j shows that the hypotheses of Theorem 7 cannot 
be weakened to allow a countable set of exceptional points. 

We close by mentioning two unusual properties of the plus topology that 
distinguish it from the standard topology on $\mathbb{R}^2$. The first follows almost immediately 
from Theorem 6: 


Theorem 8. The plus topology is not regular. 


Proof: We have already seen that $\mathbb{R}^2 \setminus F$ is plus-open, so $F$ is plus-closed. Let 
$(a, b)$ be any point not in $F$. We claim that $(a, b)$ and $F$ cannot be separated by 
plus-open sets. To see why, suppose that $U$ and $V$ are disjoint plus-open sets with 
$(a, b) \in U$ and $F \subseteq V$. Then $\mathrm{cl}_+(U)$ has empty interior in the standard topology, 
contradicting Theorem 6. ∎ 


The second unusual property of the plus topology is that it is not second 
countable, or even first countable. In fact, it is surprisingly difficult to find a 
natural basis for the plus topology. Note that the sets $+_\varepsilon(a, b)$ are not plus-open, 
and therefore cannot be used as basis sets. It turns out that for any point 
$(a, b) \in \mathbb{R}^2$, any local basis at $(a, b)$ for the plus topology must have $2^{2^{\aleph_0}}$ elements. 
This follows from more general results in [2]. 
The Forced Damped Pendulum: 
Chaos, Complication and Control 


John H. Hubbard 


We show that a “simple” differential equation modeling a garden-variety damped 
forced pendulum can exhibit extraordinarily complicated and unstable behavior. 
While instability and control might at first glance appear contradictory, we can use 
the pendulum’s instability to control it. Such results are vital in robotics: the forced 
pendulum is a basic subsystem of any robot. 

Most of the mathematical methods used in this paper were initially developed in 
celestial mechanics, largely by Poincaré. The literature of the field tends to be 
quite advanced indeed (see [1] and [11]); one object of this paper is to show that 
computer programs, properly used, can make these advanced topics transparent. 
All the computer-generated pictures in this paper were produced by the programs 
Planar Systems and Planar Iterations [6], both written by Ben Hinkle (now at 
Maple). 


1. SOME PARALLELS IN CELESTIAL MECHANICS. When I was a graduate 
student, I was amazed by the results of Alekseev concerning a system formed by 
three bodies obeying Newton’s law of gravitation; see [1] and [11]. As shown in 
Figure 1, two massive bodies of equal mass move in a plane P on ellipses
symmetric around a common focus F, and the third body, the satellite, of mass
zero, moves on the line L perpendicular to P through F. Once this satellite is
launched, its motions are determined uniquely by the gravitational pull of the two
massive bodies.

Figure 1. Alekseev’s three-body system. (Labels: the satellite, of mass 0; the massive bodies.)

The system has a natural unit of time, the “year”—the time it takes the massive 
bodies to complete a revolution. Choose a time zero, so that it makes sense to 
speak of the 0th, 1st, ..., nth year. Also let x denote the position on the line L,
with x = 0 corresponding to F.

Alekseev proved that there then exists a number N, which depends on the
eccentricity of the orbits of the large bodies, such that given any sequence
$n_1, n_2, \ldots$ of integers at least N, there exists a set of initial conditions that results
in the satellite returning to cross the plane P exactly in the $n_1$th year, the
$(n_1 + n_2)$th year, etc. In other words, given a specified sequence of years with gaps
at least N, it is possible to choose an instant $t_0$ and a speed $v = x'(t_0)$ so that if
the satellite is kicked off at that moment with that speed, it crosses the plane
during the desired years: first during the $n_1$th year, then $n_2$ years later, and so on.
You can set up the satellite to return in any sequence of years you like, so long as 
the returns are spaced at least N apart. 

In particular, there exist unbounded orbits in which the satellite travels arbitrarily
far away but always returns (for example, the orbit corresponding to the
sequence of gaps between crossings N, N + 1, N + 2, N + 3, ...), as well as
infinitely many different periodic orbits (for instance N, N + 12, N + 17, N,
N + 12, N + 17, ...).

Actually, Alekseev claimed the result only when the eccentricity is “sufficiently 
small.” He needed to know that his system satisfied some requirements (basically, 
that a “horseshoe” should be present), and he could verify this only by a 
perturbation calculation near an explicitly integrable system. Horseshoes are 
discussed in Section 8. 

The pendulum model we explore here exhibits a similar sort of behavior: we can 
make our pendulum go through any specified sequence of gyrations by correctly 
choosing the initial conditions. More precisely, by appropriately choosing the 
position and the velocity of the pendulum at time 0, we can specify whether during 
each time period (the time period of the forcing term, in our case $2\pi$) the
pendulum goes through the bottom position once clockwise, once counterclock- 
wise, or not at all. For example, we could specify that in each of the first six 
periods it could go through the bottom position once clockwise, in each of the next 
three periods it could go through the bottom position once counterclockwise, and 
in the tenth period oscillate around an upright position.... All imaginable 
sequences are possible: once the correct set of initial conditions is chosen, the 
differential equation governing the system automatically enforces the desired 
behavior. 


2. DIFFERENTIAL EQUATIONS AND PENDULUMS. There is only one law in 
mechanics: F = ma (force equals mass times acceleration). Thus the motion of a 
pendulum of length $l$, with a bob of mass m in a constant gravitational field of
force g, with friction proportional to the velocity, and forcing f(t) (Figure 2) is 
modeled by the differential equation 


$$\underbrace{f(t) - \gamma l x' - m g \sin(x)}_{\text{force}} = \underbrace{m l x''}_{\text{mass} \times \text{acceleration}}.$$


The friction term $\gamma l x'$ is a fairly good approximation to reality when the friction is
due to air, and the speed of the bob is much less than the speed of sound. The 
term mg sin(x) is the force exerted by gravity; the weight of the body is mg, but 
only the component in the direction of motion contributes to the equation. The 
forcing f(t) can be created by a current proportional to f(t) through the axis of 
the pendulum, if the bob is a bar magnet perpendicular to the axis. In realistic 
situations (e.g., robot arms), this is the way forcing is really produced.

Figure 2. A pendulum being driven by alternating current.


We explore the behavior of a pendulum whose motions are described by the 
particular differential equation 


$$\cos(t) - 0.1x' - \sin(x) = x'',$$


in which both mass m and length $l$ equal 1.

My starting point was the observation by Borelli and Coleman [3] that numerical 
solutions of this equation are very sensitive to the integration method, step-length, 
etc., near the initial condition (x(0), x’(0)) = (0,2). That is, we start with a 
pendulum hanging down, and hit it with a mallet to give it velocity near 2. This 
paper is my attempt to understand this instability. The behavior I describe holds 
not just for the parameters m, $\gamma$, $l$, g, f(t) given; they could be varied in a certain
range, which I don’t know in any detail, but which is large enough so that it would 
not be difficult to build a real system that behaves like the one described here. 
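
To make the numerical experiment concrete, here is a minimal sketch of how one might integrate this equation. It uses a fixed-step classical Runge-Kutta scheme, which is an assumption made here for illustration and not necessarily the method used by Planar Systems; the function names are likewise hypothetical.

```python
import math

def pendulum_rhs(t, x, v):
    """Right-hand side of x'' = cos(t) - 0.1 x' - sin(x) as a first-order system."""
    return v, math.cos(t) - 0.1 * v - math.sin(x)

def rk4_step(t, x, v, h):
    """Advance (x, v) from time t to t + h with one classical Runge-Kutta step."""
    k1x, k1v = pendulum_rhs(t, x, v)
    k2x, k2v = pendulum_rhs(t + h / 2, x + h / 2 * k1x, v + h / 2 * k1v)
    k3x, k3v = pendulum_rhs(t + h / 2, x + h / 2 * k2x, v + h / 2 * k2v)
    k4x, k4v = pendulum_rhs(t + h, x + h * k3x, v + h * k3v)
    x_new = x + h / 6 * (k1x + 2 * k2x + 2 * k3x + k4x)
    v_new = v + h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v)
    return x_new, v_new

def integrate(x0, v0, t_end, steps):
    """Integrate from t = 0 to t_end using `steps` equal RK4 steps."""
    h = t_end / steps
    x, v = x0, v0
    for i in range(steps):
        x, v = rk4_step(i * h, x, v, h)
    return x, v

if __name__ == "__main__":
    # Same initial condition (0, 2), two step lengths, two time horizons.
    # Comparing the coarse and fine results at each horizon is one way to
    # probe the sensitivity to step length described above.
    for t_end in (1.0, 200.0):
        coarse = integrate(0.0, 2.0, t_end, int(50 * t_end))
        fine = integrate(0.0, 2.0, t_end, int(400 * t_end))
        print(t_end, coarse, fine)
```

Running the script prints the coarse and fine results side by side, so the effect of the step length at the short and long horizons can be compared directly.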


3. A FIRST ATTEMPT TO UNDERSTAND THE MOTIONS OF THE PENDU- 
LUM. The most obvious thing to ask a computer is: what do the motions of the 
pendulum look like? The following picture shows the motion resulting from 15 
different sets of initial conditions. Each graph starts with the position x(0) = 0; the 
initial velocities are evenly spaced between 1.85 and 2.1. The graphs are plotted for 
$-1 < t < 200$ and $-25 < x < 25$. A word of caution: the overall features of
Figure 3 are correct, but the details—exactly which equilibrium each initial 
condition leads to—might well be wrong. The exponential growth of errors is 
discussed in Section 11. 

A careful look at the picture suggests that there exists a stable periodic motion 
S(t) of the pendulum, which you see in the picture many times; of course, 
$S(t) + 2k\pi$ is another description of the same motion for any integer k; the letter
S stands for “stable.” You will see five different levels of this stable periodic 
motion: one on the horizontal axis, three above, and one below. The first stable 
motion above the horizontal axis represents motions that go “over the top” once 
counterclockwise before settling down, like a child’s swing going over the bar. The 
next layer up represents motions that go over the top twice counterclockwise 
before settling down, while the layer below the horizontal axis represents motions
that go over the top once clockwise before settling down.

Figure 3. Fifteen solutions to the differential equation $\cos(t) - 0.1x' - \sin(x) = x''$.


Some motions rapidly settle down to this oscillation, others go through a 
complicated path before doing so, and yet others do not approach the periodic 
motion in this amount of time. These appear to be rare, and one might guess that 
given more time, almost all solutions do settle down. (One that does not is shown 
in [13, p. 228]; the existence of uncountably many others is proved in Theorem 3.) 

An obvious question is: what stable oscillation—what attracting periodic solu- 
tion—can a motion approach? This seems impossible to understand without 
another program.


4. THE SCANNING PICTURE. We now look at the whole family of initial 
conditions: position represented by the horizontal axis, velocity by the vertical axis. 
We ask the computer to color initial conditions according to the stable oscillation 
the corresponding solution approaches (if any). This set of initial conditions is 
called the basin of the corresponding sink; it is an open subset of $\mathbb{R}^2$.

This is best done as follows. First, find the initial values $S_0(0)$, $S_0'(0)$ for one of
the attracting periodic solutions, say the one with $-2\pi < S_0(0) < 0$. We call the
motion immediately above it $S_1$, and the one above that $S_2$; we have $S_k(t) =
S_0(t) + 2k\pi$. Next, find a number $r > 0$ such that if


$$|x(0) - S_k(0)|^2 + |x'(0) - S_k'(0)|^2 < r^2,$$


then the motion x(t) is definitely attracted to $S_k$. That is, any solution with initial values
inside the circle of radius r centered at $(S_k(0), S_k'(0))$ gets arbitrarily close to
the solution $S_k$ (in fact, does so exponentially fast). We rely on computer calculations
to determine this, but it would not be hard to provide a rigorous mathematical
justification. We are not particularly interested in the points inside that circle;
we are just establishing how we know that a motion is attracted to a particular
attracting solution: it is attracted to it if it ever enters the circle of radius r around
the solution. In our case, we have

$$(S_0(0), S_0'(0)) = (-2.0463, 0.3927) \quad\text{and we can take}\quad r = 0.1.$$


Now we solve the differential equation starting at every point of some grid (in 
our case, a 600 × 400 grid: 240,000 points!), and sample the solution at times
$2\pi, 4\pi, \ldots$: this is a substantial computation, taking about two hours even on a
fairly fast Macintosh (200 MHz). 

If for some such motion w(t) and some integer n > 0 we have 


$$|w(2n\pi) - S_k(0)|^2 + |w'(2n\pi) - S_k'(0)|^2 < r^2,$$


we know that this motion is attracted to $S_k$. Color the point $(w(0), w'(0))$ in the kth
color and solve the differential equation for the next point. If after some number
of samplings (in our case 30: we integrated solutions for time $60\pi \approx 188$) the
solution never falls within r of an attracting solution, leave the initial point white. 
We obtain Figures 4 and 5. 
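
The scanning procedure can be sketched for a single initial condition as follows. This is an illustrative sketch only: the sink coordinates and the radius r are the values quoted above, while the fixed-step Runge-Kutta integrator, the step count, and all function names are assumptions made here.

```python
import math

S0 = (-2.0463, 0.3927)   # (S_0(0), S_0'(0)) as quoted in the text
R = 0.1                  # capture radius r as quoted in the text
TWO_PI = 2 * math.pi

def rhs(t, x, v):
    """x'' = cos(t) - 0.1 x' - sin(x) written as a first-order system."""
    return v, math.cos(t) - 0.1 * v - math.sin(x)

def advance_one_period(x, v, steps=400):
    """Integrate over one forcing period with classical RK4.

    The forcing is 2*pi-periodic, so restarting the clock at 0 each period
    is equivalent to integrating on [2*n*pi, 2*(n+1)*pi].
    """
    h = TWO_PI / steps
    for i in range(steps):
        t = i * h
        k1 = rhs(t, x, v)
        k2 = rhs(t + h / 2, x + h / 2 * k1[0], v + h / 2 * k1[1])
        k3 = rhs(t + h / 2, x + h / 2 * k2[0], v + h / 2 * k2[1])
        k4 = rhs(t + h, x + h * k3[0], v + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x, v

def captured_by(x, v):
    """Return k if (x, v) lies within R of the sink s_k = s_0 + (2*k*pi, 0), else None."""
    k = round((x - S0[0]) / TWO_PI)          # index of the nearest candidate sink
    dx = x - (S0[0] + TWO_PI * k)
    dv = v - S0[1]
    return k if dx * dx + dv * dv < R * R else None

def classify(x0, v0, max_samples=30):
    """Sample the solution at t = 2*pi, 4*pi, ... and report the first capture."""
    x, v = x0, v0
    for _ in range(max_samples):
        x, v = advance_one_period(x, v)
        k = captured_by(x, v)
        if k is not None:
            return k
    return None  # left "white": not captured within the allotted samples
```

Coloring a picture then amounts to calling `classify` at every grid point and mapping the returned index to a color; in pure Python a 600 × 400 grid would be slow, so this is meant to show the logic and not to be a practical renderer.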


5. LAKES OF WADA. The colored sets $B_k$ (called, for obvious reasons, the basins
of the corresponding attracting motions $S_k$) are immensely complicated.

We show that they form infinitely many Lakes of Wada. Wada was a Japanese 
mathematician who at the beginning of the 20th century constructed an example of 
three disjoint, connected open subsets of the unit disc $D \subset \mathbb{R}^2$ such that every
point in the boundary of one is in the boundary of the other two [15]. This amazed 
the mathematical community at the time: if you try to draw three (connected, open) 
lakes in an island, you would probably soon convince yourself that all three can 
touch at only two points. Actually, it appears that Brouwer discovered this 
phenomenon earlier [4]. 


Figure 4. The different colors (hard to appreciate in black and white) represent different basins: which 
initial conditions are attracted to which sinks. Points colored white may be initial conditions that are 
never attracted to a sink, but more likely they are attracted to sinks that are off the picture. They could 
also be attracted to sinks in the picture, but not during the time allowed.

Figure 5. In black and white, the four basins of Figure 4 are hard to distinguish. This figure represents
just one basin. 


Let me sketch the construction as outlined in [15], which illustrates the dangers of
philanthropy; see Figure 6.

Suppose D is an island cursed with three philanthropists, one of whom wants to 
bring water to every inhabitant, one tea, and one coffee. At the beginning each has 
a pond of his own beverage. 


Figure 6. Digging the lakes of Wada.

First, the purveyor of water digs a system of canals emanating from his pond,
and bringing water within 100 meters of every inhabitant, never actually touching 
the surrounding sea or the other ponds, and forming no loops. 

Next, the purveyor of coffee builds a system of canals emanating from his pond, 
bringing coffee to within 10 meters of every inhabitant, again forming no loops. 
Since the water canals make no loops, they don’t cut off any inhabitants from the 
coffee pond, so this is possible. 

Now the purveyor of tea builds his system of canals, bringing tea to within 1 
meter of every inhabitant. Next the water purveyor goes back to work, extending 
his canals (necessarily building narrower ones) to bring water within 10 cm of each 
inhabitant. And so forth. At the end of this process, the poor inhabitants no longer 
have any dry land to stand on, but they have water, tea, and coffee as close as they 
want. What remains of the dry land is in the boundary of all three basins. 

Real philanthropists don’t seem to behave this way, fortunately. Highway 
designers, on the other hand... 

Theorem 1 shows that our pendulum is creating lakes of Wada. 


Theorem 1. The basins $B_k$ have the Wada property: every point in the boundary of one
basin is also in the boundary of all the others.


This is not quite as strong as the preceding statement about philanthropists, 
where every bit of dry land was in the boundary of all the basins. For the 
pendulum, all we can prove is that if a point is in the boundary of one basin, it’s in 
the boundary of the others. Presumably there is no other dry land, but we don’t 
know how to prove it. True lakes of Wada have been proven to exist in another 
setting of dynamical systems [7]. 

The first step in understanding why Theorem 1 is true is to get a grasp on the 
boundaries of the basins. Most of the material in the next section was developed by 
Kennedy, Nusse and Yorke; see [9] and [12]. They saw that the basin of a sink 
often has saddle points on its boundary, and that the stable separatrices of these 
saddle points make up the accessible boundary of the basin. We will first define 
these words. 


6. ITERATION, SINKS, SADDLES, SEPARATRICES. Rather than thinking of 
the differential equation in $\mathbb{R}^3$, I find it much easier to think of the period mapping
(or Poincaré mapping) in the plane 


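$$P : \mathbb{R}^2 \to \mathbb{R}^2 \quad\text{given by}\quad P : \begin{pmatrix} x(0) \\ x'(0) \end{pmatrix} \mapsto \begin{pmatrix} x(2\pi) \\ x'(2\pi) \end{pmatrix}.$$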


This enables me to ignore what motions do between the samples. 

There is no real loss if we are interested in long-term behavior: iterating the mapping P
m times is equivalent to solving the differential equation for time $2m\pi$,
sampling the solutions every $2\pi$. But the dynamical objects are now subsets of the
plane rather than of space: most people visualize objects in the plane much better 
than in space. In our case, the planar objects are quite complicated enough. 
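
As an illustration, the period mapping can be approximated numerically by integrating over one forcing period. The sketch below assumes a fixed-step classical Runge-Kutta integrator and hypothetical function names; the test point is the initial value of the attracting periodic solution quoted in Section 4.

```python
import math

def rhs(t, x, v):
    """x'' = cos(t) - 0.1 x' - sin(x) written as a first-order system."""
    return v, math.cos(t) - 0.1 * v - math.sin(x)

def poincare_map(p, steps=400):
    """Approximate P: (x(0), x'(0)) -> (x(2*pi), x'(2*pi)) with classical RK4."""
    x, v = p
    h = 2 * math.pi / steps
    for i in range(steps):
        t = i * h
        k1 = rhs(t, x, v)
        k2 = rhs(t + h / 2, x + h / 2 * k1[0], v + h / 2 * k1[1])
        k3 = rhs(t + h / 2, x + h / 2 * k2[0], v + h / 2 * k2[1])
        k4 = rhs(t + h, x + h * k3[0], v + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x, v

def iterate(p, m):
    """Apply P m times, i.e. sample the solution at time 2*m*pi."""
    for _ in range(m):
        p = poincare_map(p)
    return p

if __name__ == "__main__":
    start = (-2.0463, 0.3927)        # value quoted in Section 4
    print(poincare_map(start))       # should land close to the starting point
    # Shifting x by 2*pi shifts the image by 2*pi, since sin(x) is 2*pi-periodic.
    print(poincare_map((start[0] + 2 * math.pi, start[1])))
```

Because the forcing has period $2\pi$, restarting the clock at 0 for each application of `poincare_map` is legitimate; a system with a different forcing period would need the step loop adjusted accordingly.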

Seen this way, each point $s_k = (S_k(0), S_k'(0))$ is an attracting fixed point of P,
also called a sink: $P(s_k) = s_k$, and if a point p is close to $s_k$ (within r of it, for
instance), its orbit under P approaches $s_k$. The basin $B_k$ is exactly the set of points
p such that the sequence $p, P(p), P^2(p), \ldots$ approaches $s_k$.

Sinks can also be periodic of period m > 1. Such sinks are points p such that
$P^m(p) = p$, and such that if a point $p_1$ is sufficiently close to p, the sequence
$p_1, P^m(p_1), P^{2m}(p_1), \ldots$ tends to p. That is, the solution of the differential
equation with $(x(0), x'(0)) = p$ is an attracting periodic solution of period $2m\pi$.
Our mapping P appears not to have any such points (for these values of the
parameters), although proving that it has none may well be an unsolvable problem.
But there are infinitely many periodic saddles, as is proved by Theorem 3. And 
there are infinitely many more whose existence is not guaranteed by that theorem. 

Like a sink, a saddle point for P corresponds to a periodic solution of the 
original differential equation, but while sinks are associated with stable equilibria, 
saddles are associated with unstable equilibria. A periodic solution (x(t), x'(t)) of 
the differential equation gives a saddle (x(0), x'(0)) of the period mapping P if 
there is a surface made up of solutions of the differential equation that tend to this
periodic solution as time tends to $+\infty$, and another surface of solutions
that tend to it as $t \to -\infty$, i.e., as one travels
backwards in time.

An example of a saddle point is the upwards (unstable) equilibrium for an 
unforced damped pendulum. Almost all solutions are captured by a stable equilib- 
rium. But exceptional solutions exist that take an infinite amount of time to 
approach the vertical, and other solutions take an infinite amount to fall away 
from the vertical: these solutions make up two surfaces that intersect along the 
constant solution corresponding to the unstable equilibrium. The surface of 
solutions that tend to the vertical in forward time is the stable separatrix, while the 
surface of solutions tending to the vertical in backwards time is the unstable 
separatrix. The intersection of these surfaces with a Poincaré plane (i.e., the plane 
t = 0) forms two curves, also referred to as separatrices. Think of the separatrices 
as watersheds: for our unforced pendulum, they separate the initial conditions that 
go over the top one more time from those that don’t make it. 

Mappings $\mathbb{R}^2 \to \mathbb{R}^2$ (which might be the period mapping of a time-periodic
differential equation in $\mathbb{R}^2$, as in our case) usually also have sources: fixed or
periodic points that repel all nearby orbits. The period mapping P for our
pendulum has no sources because P contracts areas by $e^{-2\pi/10} \approx 0.53$, due to the
damping [8, vol. 2, chap. 8]. No mapping can simultaneously contract areas and
map some region to a strictly larger region, as would have to happen near a source.
Of course, $P^{-1}$ has sources wherever P has sinks.
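
The contraction factor can be checked with a short calculation. The sketch below writes the velocity as $y = x'$ and the flow of the system as $\Phi_t$ (both naming choices made here for illustration) and uses Liouville's formula for the Jacobian determinant of a flow.

```latex
\[
\begin{aligned}
x' &= y, \\
y' &= \cos t - 0.1\,y - \sin x ,
\end{aligned}
\qquad
\operatorname{div}
= \frac{\partial}{\partial x}(y)
+ \frac{\partial}{\partial y}\bigl(\cos t - 0.1\,y - \sin x\bigr)
= -0.1 .
\]
% Liouville's formula: the Jacobian determinant of the flow evolves with the divergence.
\[
\frac{d}{dt}\det D\Phi_t = -0.1\,\det D\Phi_t ,
\qquad\text{so}\qquad
\det DP = \det D\Phi_{2\pi} = e^{-0.1\cdot 2\pi} = e^{-2\pi/10} \approx 0.533 .
\]
```

Since the divergence is the same constant everywhere, the factor does not depend on the point at which P is evaluated.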


7. SADDLES IN THE BOUNDARY OF $B_k$. The computer finds four saddles
$p_{k,1}, \ldots, p_{k,4}$ in the boundary of each basin. These saddles form two cycles of
period 2 (i.e., the solutions of the differential equation with initial values at these
saddles have period $4\pi$). The boundary of the basin appears to be made of their
stable separatrices, as drawn in Figure 7. We will call these separatrices $\sigma^s(p_{k,i})$:
these are the watersheds that separate the solutions falling into the basin from 
those that don’t. 

In fact, the preceding statement is not true: the boundaries of the basins are not 
just the separatrices; they are much more complicated than that. The complication 
stems from the fact that all points of the boundary are limits of sequences in the 
basin, but not all such points are limits of paths. Consider Wada’s construction: 
some points of the boundary of the water are on the edge of some water stream, 
but most are not. For one thing, points on the edge of a coffee stream are not on 
the edge of a water stream, even though they are in the boundary of the water: 
there are water streams arbitrarily close, but tea streams even closer, etc. Such
points are inaccessible by water: you can reach out to them over other streams,
with an arbitrarily small motion, but you cannot reach them in a boat. Most points
of the common boundary (the separator) are not accessible from the water, coffee,
or tea.

Figure 7. The stable separatrices of the saddles of period 2 in the boundary of a basin provide an
outline drawing of the basin. Thus this picture is more or less the same as Figure 5, but the stable
manifolds would need to be continued for a very long time to get as much resolution as Figure 5
provides.

Our basins are similar to those of the Wada example. Each includes a central 
“pond” with four canals leading off from it, which dwindle to become infinitely 
narrow streams, intermingled with streams belonging to other basins. 

In our case, the inward pointing unstable separatrix at each of the four saddles
is attracted to the sink, as shown in Figure 8, and provides a path from the sink to
the stable separatrix of the saddle. Thus the stable separatrix is part of the
accessible boundary.

Figure 8. A basin cell; the points $P^{-m}(c_m)$ illustrate the proof of Theorem 2.


Theorem 2. The accessible boundary of $B_k$ is exactly the union of the stable separatrices
$\sigma^s(p_{k,i})$, $i = 1, \ldots, 4$.


The proof consists of looking at Figure 8. 

The colored neighborhood $C_k$ of the sink $s_k$ (called a basin cell in [12]) is
bounded by arcs of the four stable separatrices $\sigma^s(p_{k,i})$ and arcs of the four unstable
separatrices $\sigma^u(p_{k,i})$, which except for endpoints are contained in the interior of
the basin. Thus any accessible boundary point q of $B_k$ not in $\bigcup_i \sigma^s(p_{k,i})$ is
necessarily outside $C_k$, and a path

$$\gamma : [0,1] \to \overline{B_k}, \qquad \gamma([0,1)) \subset B_k$$

joining q to $s_k$ intersects one of these four arcs, in a point $c_0$. Similarly, the path
$P^m(\gamma)$ intersects one of these arcs in a point $c_m$. The points $z_m = P^{-m}(c_m)$ must
be on $\gamma$, and must converge to q, since for any $\varepsilon > 0$ the set $\gamma([0, 1 - \varepsilon])$ is a
compact subset of $B_k$. Thus $P^m(\gamma([0, 1 - \varepsilon]))$ is inside $C_k$ (or any neighborhood of
$s_k$) for m sufficiently large.

But the $c_m$ lie in four compact arcs of $\bigcup_i \sigma^u(p_{k,i})$, hence $P^{-m}(c_m)$ is very close
to one of the saddles for m large. So q is one of the saddles $p_{k,i}$, and hence is on
its stable separatrix.

This ends the proof of Theorem 2 (or at least a fairly convincing argument; 
it is not a rigorous proof, as we discuss in Sections 11 and 13); now to justify 
Theorem 1. 

First, it is enough to show that each accessible point of $\partial B_k$ (the boundary of
$B_k$) can be approached by every other basin. Indeed, every point of $\partial B_k$ can be
approached by accessible points, so if we can show that each accessible point of
$\partial B_k$ is in the boundary of every other basin, then every point of $\partial B_k$ is in the
boundary of every other basin.


Figure 9. All four of the unstable separatrices from the points $p_{0,i}$ enter both $B_1$ and $B_{-1}$.

Second, it is enough to know that the four outward pointing branches of the
unstable separatrices for the four accessible saddles in $\partial B_0$ enter every basin.
Indeed, if the four unstable separatrices $\sigma^u(p_{0,i})$, for i = 1, 2, 3, 4, enter $B_n$, then
the inverse images of $B_n$ accumulate to $p_{0,i}$, hence to the entire stable separatrix
$\sigma^s(p_{0,i})$. This shows a little more: if all four $\sigma^u(p_{0,i})$ enter $B_n$, then no curve can
enter $B_0$ without crossing a stream of $B_n$, i.e., entering $B_n$.

Third, rather than show that the outward-pointing part of each $\sigma^u(p_{0,i})$ enters
all the basins $B_n$, for n any integer, it is enough to show that it enters the two
neighboring basins $B_1$ and $B_{-1}$. We can prove this by induction. Figure 9 shows
that the four separatrices $\sigma^u(p_{0,i})$, i = 1, 2, 3, 4, enter the basins $B_{-1}$ and $B_1$.

Now suppose they enter $B_k$ for some $k \geq 1$. But they cannot enter $B_k$ without
entering $B_{k+1}$, because the $\sigma^u(p_{k,i})$ enter $B_{k+1}$, so that their inverse images give
streams of $B_{k+1}$, which they must ford to enter $B_k$.


8. SOLUTIONS NOT ATTRACTED TO THE SINKS. In this section we use 
techniques mainly due to Smale [14] to show that the differential equation for our 
pendulum has trajectories that carry out any specified sequence of gyrations. 
During one time interval $I_k = [2k\pi, 2(k + 1)\pi)$ a solution $(x(t), x'(t))$ may satisfy
$x(t) \equiv 0 \pmod{2\pi}$ exactly


[-1] once with x' < 0,
[0] never,
[1] once with x' > 0,
[NA] none of the above.


These events correspond to the pendulum crossing the downward position 
exactly once clockwise, not crossing it, crossing it once counterclockwise, or doing 
something else. In particular, the attracting solutions belong to the “none of the 
above” category, because they cross the downward position twice during each 
period. So, eventually, do all solutions that are attracted to them. Thus Theorem 3 
describes solutions entirely contained in the separator, which are never attracted 
to one of the sinks. 
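
To make the four categories concrete, here is a small sketch that labels one period of a sampled trajectory. It is an illustration under stated assumptions: the function name is hypothetical, the input is a list of positions x sampled in time order over one period, and the direction of a crossing is inferred from the change in x between consecutive samples instead of from x' itself, which presumes the sampling is fine enough to resolve each crossing.

```python
import math

TWO_PI = 2 * math.pi

def classify_event(xs):
    """Label one forcing period from positions xs sampled in time order.

    A crossing of the downward position is detected when two consecutive
    samples fall in different intervals [2*j*pi, 2*(j+1)*pi); an increase
    of the interval index counts as a crossing with x' > 0, a decrease as
    a crossing with x' < 0.
    """
    up = down = 0
    for x_a, x_b in zip(xs, xs[1:]):
        cell_a = math.floor(x_a / TWO_PI)
        cell_b = math.floor(x_b / TWO_PI)
        if cell_b > cell_a:
            up += cell_b - cell_a
        elif cell_b < cell_a:
            down += cell_a - cell_b
    if up == 0 and down == 0:
        return "[0]"
    if up == 1 and down == 0:
        return "[1]"
    if up == 0 and down == 1:
        return "[-1]"
    return "[NA]"
```

A motion that swings down through the bottom and back within one period is counted as two crossings and therefore labeled [NA], in line with the remark above about the attracting solutions.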


Theorem 3. Given any bi-infinite sequence of events $\ldots, E_{-1}, E_0, E_1, \ldots$ with $E_k \in
\{[-1], [0], [1]\}$ (but not [NA]), there exists a solution of our differential equation that
during each time interval $[2k\pi, 2(k + 1)\pi)$ will “do” $E_k$.


Thus given any sequence of gyrations one might choose, there is a solution that 
does exactly that. In particular, any sequence $(E_i)$ of period m whose terms sum
to 0 over one such period corresponds to a periodic cycle of period m for P.
Theorem 3 is very similar to Alekseev’s theorem, and is proved the same way: by 
exhibiting a Smale horseshoe. In Alekseev’s case this requires a delicate perturba- 
tion argument; we show how the computer can make such a result transparent. 

We have found a sequence of fixed sinks $s_k$ that correspond to the downward
equilibrium of the unforced pendulum. There is also a sequence of fixed saddles
corresponding to a periodic solution of the original differential equation of period
$2\pi$ near the unstable upward equilibrium. If you draw a sequence of quadrilaterals
$Q_k$ roughly aligned with the stable and unstable separatrices of these fixed saddles,
as in Figure 10, you expect the image of such a quadrilateral to be compressed
in the stable direction and stretched in the unstable direction, becoming long
and filiform.

Figure 10. The quadrilaterals $Q_{-1}$, $Q_0$, $Q_1$, together with the forward and backward images of $Q_0$.


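We now describe the set of points

$$Q_k(E_0, E_1, \ldots, E_N) = \{\, p \mid P^n(p) \in Q_{k + E_0 + \cdots + E_{n-1}} \text{ for } 0 \le n \le N + 1 \,\}$$

(for n = 0 the sum is empty, so the condition reads $p \in Q_k$).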


Let $A_k, B_k, C_k, D_k$ denote the corners of $Q_k$, as shown in Figure 10. The set
$P(Q_0)$ is the curvilinear quadrilateral $Q_0'$, shaded in Figure 10, with vertices
$A_0', B_0', C_0', D_0'$. The key property of the image is that it crosses the quadrilaterals
$Q_1$ and $Q_{-1}$, as well as $Q_0$ itself, in each case going from top to bottom (or bottom to
top), with the top $A_0B_0$ and bottom $C_0D_0$ mapping outside these quadrilaterals.

This implies that each of $Q_0([-1])$, $Q_0([0])$, $Q_0([1])$ forms a full-width subrectangle
of $Q_0$. Figure 11 shows the forward and backward images of $Q_0$, $Q_{-1}$ and $Q_1$,
and a blow-up showing how these intersect $Q_0$. Indeed the backward images
(light shading) form full-width subrectangles. Of course, $Q_1$ and $Q_{-1}$ also contain
such subrectangles $Q_1(E_1)$, etc. The inverse image $P^{-1}(Q_{E_0}(E_1))$ is then again a
(thinner) full-width subrectangle $Q_0(E_0, E_1)$.


Figure 11. The forward images of $Q_{-1}$, $Q_0$, $Q_1$, and their intersections with $Q_0$. At right a blow-up
of $Q_0$.

Figure 12. How the quadrilaterals move during one period.


Continuing this way, we see that for any finite sequence $(E_0, E_1, \ldots, E_N)$, the
corresponding set $Q_0(E_0, E_1, \ldots, E_N)$ is a full-width subrectangle of $Q_0$. Finally,
the assignment of an infinite forward trajectory restricts the initial position to an
infinite intersection of nested full-width subrectangles of $Q_0$; such an intersection
is a connected subset of $Q_0$ connecting one side of $Q_0$ to the other. In fact, it is a
smooth curve, but this requires writing some inequalities.

A similar argument shows that any finite backwards trajectory restricts the final
position to a full-height subrectangle of $Q_0$, and an infinite backwards trajectory
leads to a connected subset joining $A_0B_0$ to $C_0D_0$ (again in fact a smooth curve).
If $X, Y \subset Q_0$ are connected subsets, with X joining $D_0A_0$ to $B_0C_0$ and Y joining
$A_0B_0$ to $C_0D_0$, then $X \cap Y \neq \emptyset$. Thus there is a point realizing any prescribed
symbolic trajectory.
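
The bookkeeping behind a symbolic trajectory is simple enough to sketch in code. In the illustration below, events are encoded as the integers -1, 0 and 1 and the function names are hypothetical; the code only tracks which quadrilateral the orbit is required to visit after each period, and says nothing about how to find the point that realizes the sequence.

```python
def itinerary(k, events):
    """Indices of the quadrilaterals visited by an orbit starting in Q_k.

    Each event is -1, 0 or 1, so after n periods the point is required
    to lie in Q_{k + E_0 + ... + E_{n-1}}.
    """
    indices = [k]
    for e in events:
        if e not in (-1, 0, 1):
            raise ValueError("events must be -1, 0 or 1")
        indices.append(indices[-1] + e)
    return indices

def closes_up(block):
    """True if one pass through `block` returns to the starting quadrilateral."""
    return sum(block) == 0

if __name__ == "__main__":
    print(itinerary(0, [1, 1, -1, 0]))   # [0, 1, 2, 1, 1]
    print(closes_up([1, -1, 0]))         # True
    print(closes_up([1, 1]))             # False
```

A repeating block that sums to zero brings the itinerary back to its starting quadrilateral, which is the condition in the remark after Theorem 3 for a periodic cycle of P in the plane.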

Finally, I claim that the points of $Q_0([-1])$, $Q_0([0])$, $Q_0([1])$ realize the events
[-1], [0], and [1], respectively. Figure 12 shows the images of $Q_0([1])$ and $Q_0([-1])$
at times

$$\frac{2\pi}{8},\ \frac{4\pi}{8},\ \frac{6\pi}{8},\ \frac{8\pi}{8},\ \frac{10\pi}{8},\ \frac{12\pi}{8},\ \frac{14\pi}{8},\ \frac{16\pi}{8}.$$

The first set certainly seems to cut the line $x = 2\pi$ exactly once with y > 0; the
second set seems to cut the line x = 0 once with y < 0.


9. CONTROLLING THE PENDULUM. Imagine that the pendulum is massive, 
and is being used as a flywheel to control some very delicate operation, like 
polishing the mirror of a telescope. An array of lasers is constantly monitoring the 
operation, deciding on the fly whether the pendulum should turn clockwise, 
counterclockwise, or wait until the mirror has been repositioned. 

The previous section showed that there are motions of the pendulum perform- 
ing any specified sequence of gyrations, in particular the one required a posteriori 
by the polisher. But on second thought this seems useless: these motions are 
extremely unstable, and the slightest error in the initial condition destroys them, as
does any perturbation of the differential equation itself. But if the machine is to
perform any work, this inevitably perturbs the differential equation, in a way that 
is essentially unpredictable (you cannot predict how much work one swipe of 
the polisher will accomplish), and in any case we don’t know ahead of time 
the sequence of swipes and stops the task will require. 

On third thought, we see that the instability of the specified motions is exactly 
what should make them useful! Suppose that our array of sensors controls the 
current f(t) that is forcing the pendulum, changing it from cos(t) to something 
like 


(1 + a(t))cos(t) (amplitude modulation) or 
cos((1 + a(t))t) (frequency modulation), 


where a(t) represents the fine-tuning necessary to achieve the desired sequence of 
gyrations. The point is that we do not have to figure out what sequence we want 
ahead of time: the sensors can react to the polishing of the telescope on the fly, 
computing the adjustment a(t) that is necessary. It is because of the instability that 
you can keep a(t) small and still realize any sequence of gyrations: you don’t need 
to grind to a halt, compute, and start up again; the corrections can be done 
smoothly. A useful analogy is skiing: a beginning skier plants his skis well apart, 
seeking stability, which is fine until he tries to turn and discovers he can’t. An 
expert skier, with skis parallel and touching, is highly unstable, and a slight wiggle 
of the hips allows him to negotiate a mogul. Of course he doesn’t plot his entire 
path at the top of the mountain; he calculates the slight adjustments a(t) as they 
are needed. 


Theorem 4. For any sequence of events $E_0, E_1, \ldots$ and any sufficiently small disturbance
b(t) of the forcing term cos t, there exists a function a(t) of the same order of
magnitude as b(t) and an initial condition $x(0)$, $x'(0)$ such that the solution of the
differential equation


$$x'' + 0.1x' + \sin(x) + b(t) = (1 + a(t)) \cos t$$


with those initial conditions realizes the specified sequence of events.


This result is fairly obvious: choose a(t) as the pendulum approaches the 
upwards position so as to speed it up or slow it down as required. The problem is 
how to compute the a(t), in terms of available data. Clearly a(t) should depend 
only on the values of b up to time $t - 2\pi$; it should not depend on the specified
sequence of events very far ahead, as this is unknown. How small can a(t) be 
made? How far ahead in the required sequence of events does it need to look? 
How sensitive is it to small errors in the sensors? ... 
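
The sketch below shows only where the adjustment a(t) and the disturbance b(t) enter the equation of Theorem 4 when it is integrated numerically. It does not compute a suitable a(t), which is exactly the open question raised above; the integrator, the function names, and the constant test value of a are all assumptions made for illustration.

```python
import math

def controlled_rhs(t, x, v, a, b):
    """x'' + 0.1 x' + sin(x) + b(t) = (1 + a(t)) cos(t) as a first-order system."""
    return v, (1.0 + a(t)) * math.cos(t) - 0.1 * v - math.sin(x) - b(t)

def integrate(x0, v0, t_end, steps, a=lambda t: 0.0, b=lambda t: 0.0):
    """Classical RK4 from t = 0 to t_end; a and b are callables of time."""
    h = t_end / steps
    x, v = x0, v0
    for i in range(steps):
        t = i * h
        k1 = controlled_rhs(t, x, v, a, b)
        k2 = controlled_rhs(t + h / 2, x + h / 2 * k1[0], v + h / 2 * k1[1], a, b)
        k3 = controlled_rhs(t + h / 2, x + h / 2 * k2[0], v + h / 2 * k2[1], a, b)
        k4 = controlled_rhs(t + h, x + h * k3[0], v + h * k3[1], a, b)
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x, v

if __name__ == "__main__":
    free = integrate(0.0, 2.0, 1.0, 200)
    # Hypothetical constant amplitude modulation, chosen only to show the effect.
    nudged = integrate(0.0, 2.0, 1.0, 200, a=lambda t: 1e-3)
    print(free, nudged)
```

With a and b left at their defaults the system reduces to the unmodified pendulum equation, so the same routine can be used to compare controlled and uncontrolled motions from the same initial condition.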


10. CONTROL AND CELESTIAL MECHANICS. To return to celestial mechan- 
ics for a moment, it is interesting to note that when sending a spaceship to visit the 
outer solar system, NASA uses the instabilities of the differential equations 
describing gravity in much the same way as we have used the instabilities of the 
pendulum. It is well beyond present-day engineering to send a spaceship out of the 
solar system by simply using its fuel to accelerate it. Instead, it is allowed to “fall” 
into the sun, with an orbit that passes close to Venus. It then loops around Venus; 
we can imagine that it is the “satellite” in the three-body system consisting of 
itself, Venus, and the sun.

This system is similar to Alekseev’s (somewhat more complicated: a Poincaré
section would need to be 4-dimensional rather than 2), and one can prescribe an
orbit so that the spaceship steals a tiny amount of potential energy from Venus,
speeding up enormously in the process, and ends up in a very unstable state where 
a small push by guidance rockets can put it on the path to Jupiter. 

This scenario is then repeated near Jupiter, Saturn, and Uranus, with the 
spaceship each time gaining momentum, and using small pushes to head itself in 
the direction of the next destination. Thus the chaos of the solar system is essential 
to its exploration.


11. WHAT IS PROVED? To what extent does this paper prove anything? As 
written, no statement is proved anywhere: for the punchline we just looked at a 
computer picture. How do we know that these pictures are right? I do not address 
the possibility that the programs have essential bugs and are computing something 
other than what I think, or the esoteric possibility that the computer arithmetic is 
wrong. But even if the computer is computing exactly what I think, that is still only 
an approximation to solutions of the differential equations; we need to quantify 
the quality of the approximation. The contribution of round-off error also should 
be addressed. 

Actually, many of the results are not hard to prove rigorously, namely all those
where we have to show that after time $2\pi$, solutions are within some fairly large
$\varepsilon$ of the value suggested by the computer drawings.

Good estimates of long-term errors of numerical approximations to solutions of
differential equations are notoriously hard to come by, but that is not really a
problem here. First, we do not need good estimates (solutions need only be
accurate to about 0.1); second, the time considered is not long ($2\pi$); and most
important, the differential equation has a small Lipschitz constant
($\sqrt{2.001} < 1.42$). Errors in solutions to differential equations grow at most
exponentially, at a rate $e^{kt}$ where $t$ is time (in our case, $2\pi$) and $k$ is the
Lipschitz constant; with $k < 1.42$, errors grow at a fairly small interest rate, and
can be controlled for a short time.

Using these numbers, a straightforward computation using the fundamental
inequality ([8, Chapters 4 and 6]) shows that if the initial velocity satisfies
$|x'(0)| < 3$, then Euler’s method with step-length $h = 0.000002$ gives results
accurate to 0.1 after time $2\pi$. Moreover, the same inequality shows that round-off
error contributes a much smaller error yet. This is not a good way to do such
numerics; better numerical methods give much better estimates [5]. For instance,
formula (14) of [2] can be used to show that the fourth order Runge-Kutta method
with step 0.005 has more than the needed precision.

A word of caution, though. The elementary bound above says that errors of all
types are multiplied by at most $e^{2\pi \cdot 1.42} \approx 7500$ over one time period. It is
not too difficult to improve this to $e^{2\pi \cdot 1.1} \approx 1000$, and one could improve it
further. But one could not improve it very much further.

Consider for example the completely unavoidable error caused by the computer’s
inability to handle numbers with infinite precision. If it handles numbers to 16
significant digits, you may think you are starting at a saddle point, but your initial
error (the distance between the saddle point and where you really are) may be as
great as $10^{-16}$. The largest eigenvalue $\lambda$ of the linearization of P at the fixed
saddles in the $Q_k$ is about 321 (according to the computer). As long as you are in
the region where P is approximately its linearization at this saddle, errors of all
types are expanded by a factor of $\lambda$ over one time period, and hence $\lambda^m$ over m
time periods. So after m iterations the error will have mushroomed to
$10^{-16}(321^m)$: for m = 7 the initial minute error will have grown to 35. But already
for an error of 1, you will have been booted out of the region where the
linearization is a reasonable approximation to reality.

Thus no numerical method can guarantee even one digit of accuracy after six 
time periods, if we are computing with 16 significant digits. In fact, the reality is 
much worse than that, and I wouldn’t trust anything after four time periods 
without some good reason. 
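
The growth factors quoted in this section are easy to reproduce. The following Python sketch is only an illustration under stated assumptions: the function names `growth_bound` and `saddle_error` are ours, the Lipschitz constants 1.42 and 1.1, the eigenvalue 321, and the initial error $10^{-16}$ are the figures given in the text, and the linearized growth law is taken at face value.

```python
import math

PERIOD = 2 * math.pi  # one period of the forcing term cos t


def growth_bound(lipschitz: float, time: float = PERIOD) -> float:
    """A priori amplification factor e^(k t) from the fundamental inequality."""
    return math.exp(lipschitz * time)


def saddle_error(m: int, initial_error: float = 1e-16,
                 eigenvalue: float = 321.0) -> float:
    """Error after m periods near the saddle, in the linear approximation."""
    return initial_error * eigenvalue ** m


if __name__ == "__main__":
    print(f"e^(2 pi 1.42) is about {growth_bound(1.42):.0f}")
    print(f"e^(2 pi 1.1)  is about {growth_bound(1.1):.0f}")
    for m in range(1, 8):
        print(f"m = {m}: error about {saddle_error(m):.3g}")
```

The loop shows the error passing 0.1 at m = 6 and reaching about 35 at m = 7, which is the arithmetic behind the warning about six time periods.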


12. A POSTERIORI BOUNDS. Good reasons to trust solutions are available: I 
advocate extrapolation, as described in [8, Chapter 3]. At the moment, this works 
only for fixed step-length, but for a Poincaré mapping of a differential equation, 
fixed step-length is probably best anyway. For other possible methods, consult [10]. 

Denote by $u_h(t)$ the numerical approximation to the solution of some
differential equation given by the standard fourth order Runge-Kutta method with
step-length $h$, with $u_h(0) = a$. Then the theory asserts that for each fixed t the
approximation $u_h(t)$ converges to the value of the solution $u(t)$, and that we have
an asymptotic development


$u_h(t) = u(t) + C h^4 + o(h^4).$


The exponent 4 is a feature of this approximation procedure; other procedures
have different exponents.

If for some $h$ we know $u_h(t)$, $u_{h/2}(t)$, and $u_{h/4}(t)$, and we assume that we have
an asymptotic development of the form $u_h(t) = u(t) + C h^k + o(h^k)$ for some k,
we can extrapolate the values of k and of C from the values of the approximate
solutions:


$k \approx \frac{1}{\log 2} \log \left| \frac{u_h(t) - u_{h/2}(t)}{u_{h/2}(t) - u_{h/4}(t)} \right|$
and
$C \approx \frac{2^k \left( u_h(t) - u_{h/2}(t) \right)}{(2^k - 1)\, h^k}.$


Now suppose we calculate $u_{h/2^m}(t)$ for a range of values of m, focusing on the
expression for k above. The theory says that as m increases, the value of k should
approach 4, but that doesn’t take round-off error into account; typically the value
of k approaches 4 as m increases, then veers away from 4 as round-off error takes
over. If there is a range of values of m where k is close to 4, the approximation is
happening the way the theory predicts, and we can probably trust the
corresponding estimate of the error. The following data illustrate this for our
differential equation, solved for $0 \le t \le 16\pi$, i.e., for 8 periods. We start with the
two initial positions (7.15859, 0.14097) and (7.16859, 0.14097). The extrapolations
we find are


            first solution       second solution
 steps      order     error      order     error
     6
    12                22.45                86.22
    24       3.07      2.67       1.05     41.48
    48      -1.79      9.31       2.61      6.77
    96       3.26      0.96      -0.15      6.84
   192      -0.44      1.31      -1.09     14.64
   384      -2.01      5.22       3.02      1.80
   768      -0.06      5.48       4.96      0.057
  1536       5.13      0.16       4.19      0.003

Thus, the first approximation never becomes reliable; the order is never close
to 4. In particular, there is no reason to think that the quantity in the “error”
column is actually an estimate of the error. But the second appears to be
converging nicely, with the order approaching 4, and probably the error estimate of
0.003 is reliable. Thus although any estimate we make a priori for a bound for the
error is bound to be wildly pessimistic, after the computation we can make a good
guess as to how reliable it is.
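
The extrapolation procedure can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions, not the program used for the table: the test equation $u' = u$ on $[0, 1]$ stands in for the pendulum equation, the names `rk4`, `observed_order`, and `error_estimate` are hypothetical, and the error estimate is taken to be the term $C(h/4)^k$ for the finest of the three approximations.

```python
import math


def rk4(f, t0, u0, t1, steps):
    """Integrate u' = f(t, u) from t0 to t1 with classical fixed-step RK4."""
    h = (t1 - t0) / steps
    t, u = t0, u0
    for _ in range(steps):
        k1 = f(t, u)
        k2 = f(t + h / 2, u + h / 2 * k1)
        k3 = f(t + h / 2, u + h / 2 * k2)
        k4 = f(t + h, u + h * k3)
        u += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
        t += h
    return u


def observed_order(u_h, u_h2, u_h4):
    """Extrapolated exponent k from three runs with steps h, h/2, h/4."""
    return math.log(abs((u_h - u_h2) / (u_h2 - u_h4))) / math.log(2)


def error_estimate(u_h2, u_h4, k):
    """Estimate of u_{h/4} - u, i.e. C (h/4)^k, from the last two runs."""
    return (u_h2 - u_h4) / (2 ** k - 1)


if __name__ == "__main__":
    f = lambda t, u: u  # test equation with exact solution e^t
    runs = [rk4(f, 0.0, 1.0, 1.0, n) for n in (10, 20, 40)]
    k = observed_order(*runs)
    print("observed order:", k)
    print("estimated error:", error_estimate(runs[1], runs[2], k))
    print("true error:     ", runs[2] - math.e)
```

For a well-behaved problem like this one the observed order sits close to 4 and the estimate tracks the true error; the interesting cases in the table are those where it does not.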


13. QUESTIONS AND OBSERVATIONS. 


(1) Are there any periodic sinks other than the attracting fixed points we 
found? I have no idea how to attack this problem. For one thing, I don’t 
trust computer drawings on this point: in many instances I eventually 
found sinks whose basins were too small to be visible on computer 
drawings unless you knew where to look. For another, the answer might 
depend in the most delicate way on the parameters: there definitely are 
other attracting fixed points when the forcing term is 1.22 cos t instead of 
cos t; for example, there is a sink of period 3, where solutions go from the 
point with coordinates x = -1.29785, y = 1.0025 to the point x =
-1.3349, y = -0.21286, to the point x = -3.004469, y = 0.17586, and
then back to the first point ... In fact, with those parameters there are
at least two more sinks of period 3, in addition to all the translates of the
three sinks by $2\pi$.

This problem may be unsolvable. John Milnor’s candidate for the 
simplest unsolvable problem of mathematics is the question: “Does the 
polynomial $x^2 - 1.5$ have an attracting cycle?” Of course, if it does, one
can find it with a finite amount of work. But if it doesn’t, there may be no 
proof of this fact. 


(2) Is the complement of all the basins $B_k$ of measure 0? This would mean
that with probability 1 every initial point is attracted to a sink. I think this 
is the case, but have no solid grounds for this belief. Even the computer 
isn’t very definite, and besides, this is one point where numerical error 
might really be important: the perturbations of the period mapping due to 
errors of integration and round-off might affect the probability of being 
attracted to a sink. 
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The Hyperbolic Pythagorean Theorem 
in the Poincare Disc Model 
of Hyperbolic Geometry 


Abraham A. Ungar 


Sometime in the sixth century B.C. Pythagoras of Samos discovered the theorem 
that now bears his name in Euclidean geometry. The extension of the Euclidean 
Pythagorean theorem to hyperbolic geometry, which is commonly known as the 
hyperbolic Pythagorean theorem (see [3, 5, 6, 9-11]), does not have a form analogous
to the Euclidean Pythagorean theorem, so some authors have concluded that
a truly hyperbolic Pythagorean theorem does not exist. For example, Wallace and 
West assert “the Pythagorean theorem is strictly Euclidean” since “in the hyper- 
bolic [Poincaré disc] model the Pythagorean theorem is not valid!” [15]. We show 
that a natural formulation of the hyperbolic Pythagorean theorem does exist: it 
expresses the square of the hyperbolic length of the hypotenuse of a hyperbolic 
right angled triangle as a natural “sum” of the squares of the hyperbolic lengths of 
the other two sides. 

The most general Möbius transformation of the complex unit disc
$D = \{z : |z| < 1\}$ in the complex z-plane [2, 4, 8],


$z \mapsto e^{i\theta} \frac{z_0 + z}{1 + \bar{z}_0 z} = e^{i\theta} (z_0 \oplus z)$ (1)


defines the Möbius addition $\oplus$ in the disc, which allows the Möbius
transformation of the disc to be viewed as a Möbius left translation


$z \mapsto z_0 \oplus z = \frac{z_0 + z}{1 + \bar{z}_0 z}$


followed by a rotation. Here $\theta \in \mathbb{R}$ is a real number, $z_0 \in D$, and $\bar{z}_0$ is the
complex conjugate of $z_0$. A left Möbius translation is also called a left
gyrotranslation [13]. Left gyrotranslations occur frequently in hyperbolic geometry
[7, p. 55] and are sometimes called hyperbolic pure translations [9, p. 224].

The prefix gyro that we use to emphasize analogies stems from the Thomas 
gyration, which results, in turn, from the abstraction of the relativistic effect known 
as the Thomas precession [13,14]. The relevance of the Thomas precession to 
hyperbolic geometry is not unexpected [9, p. 251] since this geometry underlies 
relativistic velocities. The sensitivity of Thomas precession to the non-Euclidean 
nature of the geometry of spacetime has attracted NASA’s interest in measuring 
the Thomas precession of gyroscopes of unprecedented accuracy in Earth orbit; 
see http://einstein.stanford.edu.

The Poincaré hyperbolic distance function in D is [2]


$d(a, b) = \left| \frac{a - b}{1 - \bar{a} b} \right| = |a \ominus b|,$ (2)


where we use the obvious notation $a \ominus b = a \oplus (-b)$ for $a, b \in D$. It satisfies
the Möbius triangle inequality

$d(a, c) \le d(a, b) \oplus d(b, c),$ (3)


which involves the Möbius addition $\oplus$ of two real numbers in the complex unit
disc D. We prove (3) after the proof of our main theorem and a discussion of some
relevant group theoretic properties of Möbius addition. The right hand side of (3)
can be written as


$d(a, b) \oplus d(b, c) = \frac{d(a, b) + d(b, c)}{1 + d(a, b)\, d(b, c)} = \tanh\left( \tanh^{-1} d(a, b) + \tanh^{-1} d(b, c) \right)$ (4)

so that the Möbius triangle inequality can be written as an inequality

$\tanh^{-1} d(a, c) \le \tanh^{-1} d(a, b) + \tanh^{-1} d(b, c)$ (5)


that involves the ordinary, rather than the Möbius, addition of real numbers. The
hyperbolic distance function in D is commonly defined in the literature by
[7, p. 53]


$h(a, b) = \tanh^{-1} d(a, b) = \frac{1}{2} \ln \frac{1 + d(a, b)}{1 - d(a, b)}$ (6)

rather than by $d(a, b)$, in which case the triangle inequality takes the form

$h(a, c) \le h(a, b) + h(b, c)$ (7)


for all $a, b, c \in D$. The complex unit disc with its Poincaré distance function,
called the Poincaré disc, gives the Poincaré disc model of hyperbolic geometry, in
which geodesic lines are circular arcs that intersect the boundary of the disc
orthogonally [3].
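
The definitions (1) to (7) translate directly into code. The following Python sketch is an illustration only; the function names are ours, and points of the disc are represented by Python complex numbers. It checks the Möbius triangle inequality (3), its additive form (7), and the identity (4) at a few sample points.

```python
import math


def mobius_add(a, b):
    """Mobius addition a (+) b = (a + b) / (1 + conj(a) b) in the unit disc."""
    a, b = complex(a), complex(b)
    return (a + b) / (1 + a.conjugate() * b)


def mobius_sub(a, b):
    """a (-) b = a (+) (-b)."""
    return mobius_add(a, -b)


def d(a, b):
    """Poincare distance function (2)."""
    return abs(mobius_sub(a, b))


def h(a, b):
    """Additive hyperbolic distance (6)."""
    return math.atanh(d(a, b))


if __name__ == "__main__":
    a, b, c = 0.3 + 0.4j, -0.5 + 0.1j, 0.2 - 0.6j
    print("d(a,c)           =", d(a, c))
    print("d(a,b) (+) d(b,c) =", mobius_add(d(a, b), d(b, c)).real)
    print("h(a,c), h(a,b) + h(b,c) =", h(a, c), h(a, b) + h(b, c))
```

Working with h rather than d trades the Möbius sum for ordinary addition, which is why (7) is the form usually seen in the literature.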


Theorem. (The Hyperbolic Pythagorean Theorem) Let $\Delta abc$ be a hyperbolic
triangle in the Poincaré disc, whose vertices are the points a, b and c of the disc and
whose sides (directed counterclockwise) are $A = -b \oplus c$, $B = -c \oplus a$, and
$C = -a \oplus b$. If the two sides A and B are orthogonal, then $|A|^2 \oplus |B|^2 = |C|^2$.


Proof: Let $\Delta abc$ be any hyperbolic triangle whose vertices are the points a, b, and
c of the disc, and whose sides, A, B, and C, are geodesic segments that join the
vertices, as shown in Figure 1. The measure of the hyperbolic angle between two
sides of a hyperbolic triangle is given by the Euclidean measure of the angle
formed by Euclidean tangent rays [3]. A hyperbolic right triangle is a hyperbolic
triangle one of whose angles is $\pi/2$. Furthermore, let $\Delta abc$ be a hyperbolic right
triangle whose sides A and B are orthogonal. Its right angle can be moved to the
center of D by an appropriate Möbius transformation (1) such that its two
orthogonal sides lie on the real and on the imaginary axes of D, as shown in
Figure 1. Möbius transformations of the disc preserve both the hyperbolic length
of geodesic segments and the measure of hyperbolic angles. Hence, the resulting
triangle $\Delta a'b'c'$, obtained by moving $\Delta abc$ as shown in Figure 1, is congruent to
$\Delta abc$ in the sense that the two triangles $\Delta a'b'c'$ and $\Delta abc$ possess equal
hyperbolic lengths for corresponding sides and equal measures for corresponding
angles.


Figure 1. The Hyperbolic Pythagorean Theorem in the complex unit disc,
$|C|^2 = |A|^2 \oplus |B|^2$. The square of the hyperbolic length of the hypotenuse of a
hyperbolic right triangle equals the Möbius sum of the squares of the hyperbolic
lengths of the other two sides. Furthermore, $\sin\alpha = \gamma_A |A| / (\gamma_C |C|)$ and
$\sin\beta = \gamma_B |B| / (\gamma_C |C|)$.


The vertices of the relocated hyperbolic right triangle $\Delta a'b'c'$ are $a' = x$,
$b' = iy$, and $c' = 0$, for some $x, y \in (-1, 1)$. The hyperbolic length of the geodesic
segment joining two points a and b of the disc is $d(a, b) = |b \ominus a|$. Accordingly,
the hyperbolic lengths of the sides A, B, C of the triangle $\Delta a'b'c'$ are $|A|$, $|B|$,
and $|C|$ given by


$|A|^2 = |b' \ominus c'|^2 = y^2,$

$|B|^2 = |a' \ominus c'|^2 = x^2,$ and (8)

$|C|^2 = |a' \ominus b'|^2 = |x \ominus iy|^2 = \left| \frac{x - iy}{1 - ixy} \right|^2 = \frac{x^2 + y^2}{1 + x^2 y^2} = x^2 \oplus y^2.$

Hence

$|A|^2 \oplus |B|^2 = |C|^2,$ (9)


which verifies the hyperbolic Pythagorean theorem for hyperbolic right triangles in
the Poincaré disc. ∎
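
The theorem can be checked numerically. The following Python sketch is a hedged illustration: the function names and the sample values of x, y, $z_0$, and $\theta$ are our own choices. It builds a right triangle with legs on the axes, moves it by a left gyrotranslation followed by a rotation, and compares $|A|^2 \oplus |B|^2$ with $|C|^2$ for the moved triangle.

```python
import cmath


def mobius_add(a, b):
    """Mobius addition in the unit disc."""
    a, b = complex(a), complex(b)
    return (a + b) / (1 + a.conjugate() * b)


def real_mobius_add(s, t):
    """Mobius addition restricted to real numbers in (-1, 1)."""
    return (s + t) / (1 + s * t)


def moved_right_triangle(x, y, z0, theta):
    """Vertices a, b, c of the triangle (x, iy, 0) after z -> e^{i theta}(z0 (+) z)."""
    rot = cmath.exp(1j * theta)
    move = lambda z: rot * mobius_add(z0, z)
    return move(x), move(1j * y), move(0)


def side_lengths_squared(a, b, c):
    """|A|^2, |B|^2, |C|^2 for A = -b (+) c, B = -c (+) a, C = -a (+) b."""
    A = mobius_add(-b, c)
    B = mobius_add(-c, a)
    C = mobius_add(-a, b)
    return abs(A) ** 2, abs(B) ** 2, abs(C) ** 2


if __name__ == "__main__":
    a, b, c = moved_right_triangle(0.6, 0.3, 0.2 + 0.5j, 0.7)
    A2, B2, C2 = side_lengths_squared(a, b, c)
    print("|A|^2 (+) |B|^2 =", real_mobius_add(A2, B2))
    print("|C|^2           =", C2)
```

Because the distance function is invariant under left gyrotranslations and rotations, the moved triangle has the same side lengths as the one on the axes, so the two printed values agree up to rounding.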


The Hyperbolic Pythagorean Theorem is not an isolated analogy with Euclidean
geometry; analogies between the Poincaré disc model of hyperbolic geometry and
Euclidean plane geometry abound in gyrogroup theory [12]. It is shown there that
the Möbius addition, $\oplus$, is analogous to the common vector addition, +, in
Euclidean plane geometry. If we define


$\mathrm{gyr}[a; b] = \frac{a \oplus b}{b \oplus a} = \frac{1 + a\bar{b}}{1 + \bar{a} b},$ (10)

then $\mathrm{gyr}[a; b]$ has modulus 1 and for all $a, b, c \in D$ the following group-like
properties of $\oplus$ can be verified by straightforward algebra:


$a \oplus b = \mathrm{gyr}[a; b](b \oplus a)$                      Gyrocommutative Law
$a \oplus (b \oplus c) = (a \oplus b) \oplus \mathrm{gyr}[a; b]c$      Left gyroassociative Law
$(a \oplus b) \oplus c = a \oplus (b \oplus \mathrm{gyr}[b; a]c)$      Right gyroassociative Law
$\mathrm{gyr}[a; b] = \mathrm{gyr}[a \oplus b; b]$                     Left Loop Property
$\mathrm{gyr}[a; b] = \mathrm{gyr}[a; b \oplus a]$                     Right Loop Property


A resulting geometrically important identity, also verifiable by straightforward
algebra, is [12]


$(x \oplus a) \ominus (x \oplus b) = \mathrm{gyr}[x; a](a \ominus b)$ (11)

for all $a, b, x \in D$. Taking the modulus of each side of (11) gives

$d(x \oplus a, x \oplus b) = d(a, b),$ (12)

which shows that the Poincaré distance function (2) is invariant under Möbius left
gyrotranslations.
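
The "straightforward algebra" can be spot-checked by machine. The Python sketch below is an illustration under our own assumptions: the helper names are hypothetical, gyr is implemented as multiplication by the unimodular number in (10), and random sample points of modulus at most 0.9 stand in for arbitrary points of D. It returns the defects of the five laws and of identity (11).

```python
import cmath
import math
import random


def mobius_add(a, b):
    a, b = complex(a), complex(b)
    return (a + b) / (1 + a.conjugate() * b)


def mobius_sub(a, b):
    return mobius_add(a, -b)


def gyr(a, b):
    """The unimodular number gyr[a; b] = (1 + a conj(b)) / (1 + conj(a) b)."""
    a, b = complex(a), complex(b)
    return (1 + a * b.conjugate()) / (1 + a.conjugate() * b)


def law_defects(a, b, c):
    """Absolute defects of the five gyro laws and of identity (11) with x = c."""
    return [
        abs(mobius_add(a, b) - gyr(a, b) * mobius_add(b, a)),
        abs(mobius_add(a, mobius_add(b, c))
            - mobius_add(mobius_add(a, b), gyr(a, b) * c)),
        abs(mobius_add(mobius_add(a, b), c)
            - mobius_add(a, mobius_add(b, gyr(b, a) * c))),
        abs(gyr(a, b) - gyr(mobius_add(a, b), b)),
        abs(gyr(a, b) - gyr(a, mobius_add(b, a))),
        abs(mobius_sub(mobius_add(c, a), mobius_add(c, b))
            - gyr(c, a) * mobius_sub(a, b)),
    ]


def random_point(rng, radius=0.9):
    """A random point of the disc of the given radius."""
    return cmath.rect(radius * math.sqrt(rng.random()), 2 * math.pi * rng.random())


if __name__ == "__main__":
    rng = random.Random(0)
    worst = max(max(law_defects(*(random_point(rng) for _ in range(3))))
                for _ in range(1000))
    print("largest defect over 1000 random triples:", worst)
```

A numerical check is no substitute for the algebra, but it is a quick guard against sign and conjugation slips when transcribing the laws.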

To verify the Möbius triangle inequality (3), let $\gamma_a = (1 - |a|^2)^{-1/2}$ for any
$a \in D$. Then $\gamma_a = \gamma_{|a|}$ is a monotonically increasing function of $|a|$ that satisfies
the useful identity


$\gamma_{a \oplus b} = \gamma_a \gamma_b |1 + \bar{a} b|$ (13)


for all $a, b \in D$ [1, p. 2], as one can verify by squaring both sides.
It follows from (13) that


$\gamma_{||a| \oplus |b||} = \gamma_{|a| \oplus |b|} = \gamma_{|a|} \gamma_{|b|} (1 + |a||b|) \ge \gamma_a \gamma_b |1 + \bar{a} b| = \gamma_{a \oplus b} = \gamma_{|a \oplus b|}.$ (14)


Since $||a| \oplus |b|| = |a| \oplus |b|$, and since $\gamma_z = \gamma_{|z|}$ is a monotonically increasing
function of $|z|$, the inequality in (14) implies the inequality

$|a| \oplus |b| \ge |a \oplus b|$ (15)


for all $a, b \in D$.
Replacing x by $-x$ in (11), and noting that $-(-x \oplus b) = x \ominus b$, we have

$(-x \oplus a) \oplus (x \ominus b) = \mathrm{gyr}[-x; a](a \ominus b)$ (16)

for all $x, a, b \in D$. Finally, (16) and (15) imply

$d(a, b) = |a \ominus b| = |\mathrm{gyr}[-x; a](a \ominus b)| = |(-x \oplus a) \oplus (x \ominus b)|$
$\le |-x \oplus a| \oplus |x \ominus b| = d(a, x) \oplus d(x, b)$

for all $a, b, x \in D$, which proves the Möbius triangle inequality (3).
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Is the Composite Function Integrable? 


Jitan Lu 


It is well known that the composition of two continuous functions is continuous
and hence Riemann integrable. However, the composition of two Riemann
integrable functions may or may not be Riemann integrable. For example, let


$f(y) = \begin{cases} 1 & \text{when } y \ne 0, \\ 0 & \text{when } y = 0, \end{cases}$

$g(x) = \begin{cases} 0 & \text{when } x \text{ is an irrational number}, \\ \frac{1}{p} & \text{when } x = \frac{q}{p}, \text{ where } p \text{ and } q \text{ are two coprime integers}. \end{cases}$


Then


$f \circ g(x) = \begin{cases} 0 & \text{when } x \text{ is an irrational number}, \\ 1 & \text{when } x = \frac{q}{p}, \text{ where } p \text{ and } q \text{ are two coprime integers}. \end{cases}$


Both f and g are Riemann integrable on [0, 1], but the composition $f \circ g$ is not.
Therefore, it is natural to ask whether the composition of two functions is still
Riemann integrable, when one is Riemann integrable and the other is continuous.

In what follows, we let f be a function defined on the interval [a, b], and let g 
be a function defined on the interval [c, d] with its range contained in [a, b]. 


Question 1. If f is continuous on [a, b] and g is Riemann integrable on [c, d], is
the composition $f \circ g$ Riemann integrable on [c, d]?

The answer is yes.

Since f is continuous on the closed interval [a, b], it is uniformly continuous on
[a, b]. Hence, for each $\varepsilon > 0$, there exists a $\delta > 0$, such that for any $\xi_1$ and $\xi_2$ in
[a, b] with $|\xi_1 - \xi_2| < \delta$ we have


$|f(\xi_1) - f(\xi_2)| < \frac{\varepsilon}{2(d - c)}.$ (1)


Moreover, f is bounded on [a, b]; say, $|f(y)| < M$ for all $y \in [a, b]$.

Since g is Riemann integrable on [c, d], for the above $\delta > 0$, there exists an
$\eta > 0$ such that for any division T of [c, d] with norm $|T| < \eta$, the following
relation always holds:


$\sum_i \omega_i \Delta x_i < \frac{\delta \varepsilon}{4M},$ (2)


where $\Delta x_i$ is the length of the interval $I_i$ in the division T and

$\omega_i = \sup_{x, y \in I_i} |g(x) - g(y)|$

is the oscillation of g on $I_i$. We recall that the norm $|T|$ is the maximum length of
the intervals in T.

Now we consider the composition $f \circ g$. For the division T, let $M_i$ be the
oscillation of $f \circ g$ on $I_i$. Divide all the intervals of the division T into two parts.
The first part contains all the intervals on which the oscillation of g is not less
than $\delta$, and the second part contains the rest of the intervals. Then we have


$\sum_i M_i \Delta x_i = \sum_{\omega_i \ge \delta} M_i \Delta x_i + \sum_{\omega_i < \delta} M_i \Delta x_i.$ (3)


From (1), we know that for any interval $I_i$ in the second part,
$M_i \le \varepsilon / (2(d - c))$. Thus


$\sum_{\omega_i < \delta} M_i \Delta x_i \le \frac{\varepsilon}{2(d - c)} \sum_{\omega_i < \delta} \Delta x_i \le \frac{\varepsilon}{2(d - c)} (d - c) = \frac{\varepsilon}{2},$ (4)

but

$\sum_i \omega_i \Delta x_i \ge \sum_{\omega_i \ge \delta} \omega_i \Delta x_i \ge \delta \sum_{\omega_i \ge \delta} \Delta x_i.$ (5)

Combining (5) with (2), we obtain

$\sum_{\omega_i \ge \delta} \Delta x_i < \frac{\varepsilon}{4M}.$


Since $|f| < M$ on [a, b], every oscillation satisfies $M_i \le 2M$. Then

$\sum_{\omega_i \ge \delta} M_i \Delta x_i \le 2M \sum_{\omega_i \ge \delta} \Delta x_i < 2M \cdot \frac{\varepsilon}{4M} = \frac{\varepsilon}{2}.$ (6)


Combining (3) with (4) and (6), we have


$\sum_i M_i \Delta x_i < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$


That is to say, $f \circ g$ is Riemann integrable on [c, d].
Thus we have proved the following result, which can also be found in [1, p. 197].


Proposition 1. If f is continuous on [a, b] and g is Riemann integrable on [c, d] with
its range in [a, b], then $f \circ g$ is Riemann integrable on [c, d].
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Question 2. If f is Riemann integrable on [a,b] and g is continuous on [c, d], is 
$f \circ g$ always Riemann integrable on [c, d]?
The answer is negative, as shown by the following counterexample. Let


$f(y) = \begin{cases} 0 & \text{when } y = 0, \\ 1 & \text{when } y \ne 0, \end{cases}$


on [a, b] = [0, 1], and define g inductively as follows.

First, let $g_0(x) = 0$, $x \in [0, 1]$. Next, construct $g_1$ based on $g_0$. Divide [0, 1] into
three sections, say, $I_1$, $I_2$, $I_3$ in proper order, such that the centre of $I_2$ is $\frac{1}{2}$ and
the length of $I_2$ is $\frac{1}{3}$. Modifying the function $g_0$ on $I_2$ appropriately, we obtain a
function $g_1$ that satisfies the following conditions:


• $g_1(x) = g_0(x)$ for x in $I_1$ and $I_3$;

• $g_1$ is continuous on [0, 1];

• $g_1(x)$ is always greater than zero for any x in the interior of $I_2$;
• the maximum value of $g_1$ on $I_2$ is $\frac{1}{2}$.


Once $g_{n-1}$ is defined, we construct $g_n$ as follows. First, divide all the intervals
on which $g_{n-1}$ is always zero into three sections, such that the centre of the
middle section is the centre of the original interval and the length of the middle
section is $1/(3^n \cdot 2^{n-1})$. Second, modify the values of $g_{n-1}$ only on the middle
sections and obtain a function $g_n$, such that $g_n$ is still continuous on [0, 1],
but in the interior of each modified interval, $g_n$ is always greater than zero and
the maximum is $2^{-n}$. We note that there are $2^{n-1}$ intervals in which $g_n$ and $g_{n-1}$
have different values. Thus the total length of them is $3^{-n}$.

Continuing this process gives a sequence of functions $\{g_n\}$ that satisfy the
following conditions:


• $g_n$ is continuous on [0, 1];

• $|g_n(x) - g_{n-1}(x)| \le \frac{1}{2^n}$, for any $x \in [0, 1]$;

• the total length of all the intervals in which $g_n$ is not zero is

$S_n = \frac{1}{3} + \frac{1}{3^2} + \cdots + \frac{1}{3^n} = \frac{1}{2} \left( 1 - \frac{1}{3^n} \right).$


Thus, for any positive integers n > m we have


$|g_n(x) - g_m(x)| \le |g_n(x) - g_{n-1}(x)| + \cdots + |g_{m+1}(x) - g_m(x)|$
$\le \frac{1}{2^n} + \frac{1}{2^{n-1}} + \cdots + \frac{1}{2^{m+1}} < \frac{1}{2^m}.$


For any $\varepsilon > 0$, choose a positive integer N, say $N > \ln \varepsilon^{-1} / \ln 2$ when $\varepsilon < 1$.
Then for any integers n > m > N, we have $|g_n(x) - g_m(x)| < 2^{-m} < \varepsilon$ for any
$x \in [0, 1]$. That is to say, $g_n(x)$ is uniformly convergent on [0, 1]. Let $g_n(x)$ be
uniformly convergent to $g(x)$ on [0, 1]. Then g satisfies:


• g is continuous on [0, 1];
• g(x) is not identically zero on any subinterval of [0, 1];
• the total length of all the intervals in which g(x) is not zero is


$S = \lim_{n \to \infty} \frac{1}{2} \left( 1 - \left( \frac{1}{3} \right)^n \right) = \frac{1}{2}.$
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We now prove that f° g is not Riemann integrable on [0, 1]. 

Let T be a division of [0, 1]. Divide T into two parts. The first part $T_1$
contains all the intervals on which g(x) is everywhere non-zero and the second part
$T_2$ contains the rest. The total length of all the intervals in $T_1$ is at most $\frac{1}{2}$;
hence the total length of all the intervals in $T_2$ is at least $\frac{1}{2}$. But in any interval
$I_i$ of $T_2$, we can always find two points $\xi_i$ and $\zeta_i$ such that $g(\xi_i) = 0$ and
$g(\zeta_i) \ne 0$. Obviously, $f \circ g(\xi_i) = 0$ and $f \circ g(\zeta_i) = 1$. Thus the oscillation $M_i$ of
$f \circ g$ on $I_i$ is 1.

Let $M_i$ be the oscillation of $f \circ g$ on any interval $I_i$ of T and $\Delta x_i$ be the
length of the interval $I_i$. Then


$\sum_i M_i \Delta x_i = \sum_{T_1} M_i \Delta x_i + \sum_{T_2} M_i \Delta x_i \ge \sum_{T_2} M_i \Delta x_i = \sum_{T_2} \Delta x_i \ge \frac{1}{2}.$


Thus $f \circ g$ is not Riemann integrable on [0, 1].
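
The bookkeeping in the construction of g (how many middle sections are modified at each step and how much length they cover) can be verified with exact rational arithmetic. The Python sketch below is an illustration only; the function name `construct` is ours, and it tracks the intervals on which $g_n$ vanishes rather than the functions themselves.

```python
from fractions import Fraction


def construct(n_steps):
    """Track the zero intervals of g_n and the length modified at each step.

    Returns (zero_intervals, added), where added[n-1] is the total length of
    the middle sections on which g_n differs from g_{n-1}.
    """
    zero_intervals = [(Fraction(0), Fraction(1))]
    added = []
    for n in range(1, n_steps + 1):
        mid_len = Fraction(1, 3 ** n * 2 ** (n - 1))
        next_zero = []
        for lo, hi in zero_intervals:
            centre = (lo + hi) / 2
            m_lo, m_hi = centre - mid_len / 2, centre + mid_len / 2
            # The middle section must fit strictly inside the zero interval.
            assert lo < m_lo and m_hi < hi
            next_zero += [(lo, m_lo), (m_hi, hi)]
        added.append(mid_len * len(zero_intervals))
        zero_intervals = next_zero
    return zero_intervals, added


if __name__ == "__main__":
    zero, added = construct(8)
    for n, length in enumerate(added, start=1):
        print(n, length, sum(added[:n]))
```

The running totals are the partial sums $S_n$, which stay below 1/2, so the set where g vanishes always keeps total length greater than 1/2 even though it contains no interval in the limit.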


The discussion can be continued by asking for conditions on g to ensure that
$f \circ g$ is Riemann integrable, provided that f is Riemann integrable. The following
result provides one answer to this question. The proof is left to the reader.


Proposition 2. Let f be a Riemann integrable function defined on [a, b] and let g be a
differentiable function with continuous and non-zero derivative on [c, d]. If the range of
g is contained in [a, b], then $f \circ g$ is Riemann integrable on [c, d].

On the Generalized “Lanczos’
Generalized Derivative”


Jianhong Shen 


This short note is an extrapolation of Groetsch’s interesting article [1], and may 
lead to a clearer understanding of Lanczos’ derivative. Only a minimal familiarity 
with random variables is required. 

Lanczos’ generalized derivative is defined by


$D_h f(x) = \frac{3}{2h^3} \int_{-h}^{h} t f(x + t)\, dt,$


where h is a parameter that can be assumed positive. It generalizes the ordinary
derivative in the following two senses:


(1) Suppose f(x) is locally $C^3$ at $x_0$. Then $D_h f(x_0) = f'(x_0) + O(h^2)$.
(2) Suppose f(x) has both the right and left derivatives $f'_R(x_0)$ and $f'_L(x_0)$ at $x_0$.
Then


$\lim_{h \to 0} D_h f(x_0) = \frac{f'_R(x_0) + f'_L(x_0)}{2}.$ (1)


A few things puzzled me as I read [1]. First, what does the coefficient $3/(2h^3)$ in
the definition really mean? Second, how can one see easily from its integral
definition that $D_h$ is like a derivative? And finally, how exactly are the right and
left derivatives involved in the limiting process of (1)? These questions gave rise to
this note.

Let X be a bounded symmetric continuous random variable (i.e., X and $-X$
have the same distribution function) with variance 1. For example, X might be
uniformly distributed on $[-\sqrt{3}, \sqrt{3}]$ (with mean 0 and variance 1).

Recall that the ordinary finite difference operator $d_h$ is defined by


$d_h f(x) = \frac{f(x + h) - f(x)}{h}.$

For any positive number $\sigma$, define


$L_\sigma f(x) = E\{X^2 d_{\sigma X} f(x)\},$

where E is the expectation operator.
The motivation is simple. If $\sigma$ is very small, $Y = \sigma X$ behaves like an
atomic distribution at the origin. Therefore, one can pretend that X and Y are
independent:


$L_\sigma f(x) = E\{X^2\} E\{d_Y f(x)\} = E\{d_Y f(x)\}.$

This is an averaged $d_h$! Hence, $L_\sigma$ does resemble the ordinary derivative for
small $\sigma$.
Moreover, $L_\sigma$ generalizes Lanczos’ derivative $D_h$. To see this, take X to be any
random variable that is uniformly distributed on $[-\sqrt{3}, \sqrt{3}]$. Define $h = \sqrt{3}\,\sigma$. We
show that $L_\sigma = D_h$:


$L_\sigma f(x) = E\left\{ \frac{X}{\sigma} \left[ f(x + \sigma X) - f(x) \right] \right\} = \frac{1}{\sigma} E\{X f(x + \sigma X)\}$

$= \frac{1}{\sigma} \int_{-\sqrt{3}}^{\sqrt{3}} t f(x + \sigma t) \frac{dt}{2\sqrt{3}} = \frac{1}{2h} \int_{-\sqrt{3}}^{\sqrt{3}} t f(x + \sigma t)\, dt$

$= \frac{3}{2h^3} \int_{-h}^{h} s f(x + s)\, ds = D_h f(x).$


We now understand that the mysterious coefficient $3/(2h^3)$ has evolved from the
simple parameter $\sigma$ after such a long journey!
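
Both properties of Lanczos’ derivative are easy to observe numerically. The Python sketch below is a hedged illustration: the function name `lanczos_derivative` is ours, and the integral in the definition of $D_h$ is approximated by a plain midpoint rule with an even number of cells, a choice made only so that a kink at the base point falls on a cell boundary.

```python
import math


def lanczos_derivative(f, x, h, n=2000):
    """Approximate D_h f(x) = 3/(2 h^3) * integral_{-h}^{h} t f(x + t) dt.

    The integral is evaluated with the midpoint rule on n equal cells.
    """
    dt = 2 * h / n
    total = 0.0
    for i in range(n):
        t = -h + (i + 0.5) * dt
        total += t * f(x + t)
    return 3.0 / (2.0 * h ** 3) * total * dt


if __name__ == "__main__":
    # Smooth case: D_h sin(x) differs from cos(x) by a term of order h^2.
    for h in (0.2, 0.1, 0.05):
        print(h, lanczos_derivative(math.sin, 0.3, h) - math.cos(0.3))
    # Kink at 0 with right derivative 1 and left derivative 2.
    kink = lambda x: x if x > 0 else 2 * x
    print(lanczos_derivative(kink, 0.0, 0.1))
```

Halving h in the smooth case divides the discrepancy by about four, and the kinked function returns a value near 3/2, the average of its one-sided derivatives.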
A rigorous error estimation for $L_\sigma f(x)$ follows. If f(x) is $C^3$ near $x_0$, then


$d_{\sigma X} f(x_0) = f'(x_0) + \frac{f''(x_0)}{2} \sigma X + O(\sigma^2)$ as $\sigma \to 0$.


The error term bound does not depend on the samples of X since we have
assumed that X is bounded. Therefore,


$L_\sigma f(x_0) = E\left[ X^2 f'(x_0) + \frac{f''(x_0)}{2} \sigma X^3 + X^2 O(\sigma^2) \right] = f'(x_0) + O(\sigma^2).$

Notice that $E\{X^3\} = 0$ since X is symmetric. This extends the first property of
Lanczos’ derivative.

The second property of Lanczos’ derivative generalizes to $L_\sigma$ in a similar
fashion. Assume that both $f'_R(x_0)$ and $f'_L(x_0)$ exist. Then


$L_\sigma f(x_0) = E\{X^2 d_{\sigma X} f(x_0) : X > 0\} + E\{X^2 d_{\sigma X} f(x_0) : X < 0\}$
$= E\{X^2 f'_R(x_0) + X^2 o(1) : X > 0\} + E\{X^2 f'_L(x_0) + X^2 o(1) : X < 0\}$
$= E\{X^2 f'_R(x_0) : X > 0\} + E\{X^2 f'_L(x_0) : X < 0\} + o(1)$
$= f'_R(x_0) E\{X^2 : X > 0\} + f'_L(x_0) E\{X^2 : X < 0\} + o(1)$
$= \frac{f'_R(x_0) + f'_L(x_0)}{2} + o(1).$


In the last step, we have applied the symmetry condition and $E\{X^2\} = 1$. The
roles of $f'_R$ and $f'_L$ are seen clearly from these five lines.

Finally, notice that: (1) If f(x) is Lipschitz continuous at $x_0$ with L as its
Lipschitz constant, then $|L_\sigma f(x_0)| \le L$; (2) The random variable involved can be
replaced by any suitable distribution with a compact support, since we have not
used the positivity condition.
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A Stability Theorem 


Walter Rudin 


In 1968 I proved a theorem (stated below) about zeros of holomorphic functions in 
a polydisc [2, p. 87] which was later, in [1], referred to, much to my surprise, as a 
“cornerstone” of multivariable stability theory. The authors of [1] pointed out, 
quite correctly, that my proof used quite a bit of homotopy theory, and they 
proceeded to prove the theorem by a sequence of more elementary steps. The 
present note contains an even easier proof, which is also much shorter, and which 
relies only on very simple properties of the index (or winding number) of a plane 
curve around the origin. 

The following notation will be used. $C$ is the complex plane, $C^* = C \setminus \{0\}$ is the
set of all nonzero complex numbers, $U$ and $\bar{U}$ are the open and closed unit discs in
$C$, respectively, and $T$ is the unit circle. For $n \ge 1$,


$C^n = C \times \cdots \times C, \quad U^n = U \times \cdots \times U, \quad T^n = T \times \cdots \times T;$


each of these cartesian products has n factors. The torus $T^n$ is the so-called
distinguished boundary of $U^n$; it is a small (n-dimensional) part of the whole
$(2n - 1)$-dimensional boundary of the polydisc $U^n$.

$A(U^n)$ is the class of all continuous $f: \bar{U}^n \to C$ that are holomorphic in $U^n$.
If now $\Gamma: [0, 2\pi] \to C^*$ is continuous and $\Gamma(2\pi) = \Gamma(0)$ (so that $\Gamma([0, 2\pi])$ is a
closed curve in $C^*$) then there exists a continuous real-valued function $\alpha$ on
$[0, 2\pi]$ such that


$\Gamma(\theta) = |\Gamma(\theta)| \exp\{2\pi i \alpha(\theta)\} \quad (0 \le \theta \le 2\pi).$ (1)


Since $\Gamma(2\pi) = \Gamma(0)$, $\alpha(2\pi) - \alpha(0)$ is an integer (positive, negative, or 0). This is
the index of $\Gamma$:


$\mathrm{Ind}\, \Gamma = \alpha(2\pi) - \alpha(0).$ (2)


(Note that $\mathrm{Ind}\, \Gamma$ is independent of the particular choice of $\alpha$.)

We need the following properties of the index.

(I) Suppose $(s, \theta) \to \Gamma_s(\theta)$ is a continuous map from $[0, 1] \times [0, 2\pi]$ into $C^*$,
and $\Gamma_s(2\pi) = \Gamma_s(0)$ for all s. Then $\mathrm{Ind}\, \Gamma_s$ is the same for all s.

The reason is simply that $\mathrm{Ind}\, \Gamma_s$ is a continuous function of s. Being
integer-valued, this function is constant on the connected set [0, 1].

(II) If $G: \bar{U} \to C^*$ is continuous and if we define $G|_T(\theta) = G(e^{i\theta})$ $(0 \le \theta \le 2\pi)$
then $\mathrm{Ind}\, G|_T = 0$.

To deduce this from (I) put $\Gamma_s(\theta) = G(s e^{i\theta})$ and note that $\Gamma_1 = G|_T$, $\Gamma_0$ is the
constant $G(0)$.

(III) If $h \in A(U)$ and $h(T) \subset C^*$ then $\mathrm{Ind}\, h|_T$ is equal to the number of zeros of h
in U.

This is the classical “argument principle” of complex analysis.
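
Properties (II) and (III) can be observed numerically. The Python sketch below is an illustration under our own assumptions: the function name `index` is hypothetical, the closed curve is sampled at finitely many points, and the index is obtained by summing small increments of the argument, which is reliable only when consecutive samples subtend an angle less than $\pi$ at the origin.

```python
import cmath
import math


def index(curve, samples=4000):
    """Index around 0 of the closed curve theta -> curve(theta), 0 <= theta <= 2 pi.

    Sums the argument increments between consecutive sample points; each
    increment is taken in (-pi, pi], so the sampling must be fine enough.
    """
    total = 0.0
    prev = curve(0.0)
    for k in range(1, samples + 1):
        cur = curve(2 * math.pi * k / samples)
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2 * math.pi))


if __name__ == "__main__":
    # h has zeros 0 (double) and 0.5 inside U, and 2 outside.
    h = lambda z: z ** 2 * (z - 0.5) * (z - 2)
    print(index(lambda t: h(cmath.exp(1j * t))))      # number of zeros of h in U
    # G(z) = z + 2 has no zeros in the closed disc, so its index vanishes.
    print(index(lambda t: cmath.exp(1j * t) + 2))
```

The first call illustrates the argument principle (III), the second illustrates (II).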


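Theorem. Suppose $\Phi = (\varphi_1, \ldots, \varphi_n)$ is a continuous map of $\bar{U}$ into $\bar{U}^n$ that carries T
into $T^n$, such that

$\mathrm{Ind}\, \varphi_j|_T > 0$ for $1 \le j \le n$. (3)

Put $K = \Phi(\bar{U})$. Then

$f(T^n \cup K) = f(\bar{U}^n)$ (4)


for every $f \in A(U^n)$.


Proof: Assume $f(z) \ne 0$ for every $z \in T^n \cup K$. We show that $f(z) \ne 0$ for every
$z \in U^n$. This implies the theorem, and shows why the term “stability” was used in
this connection.

Fix $a = (a_1, \ldots, a_n) \in U^n$. Let $\mathrm{Ind}\, \varphi_j|_T = m_j$. There exist $c_j \in C$ such that


$c_j^{m_j} = a_j \quad (1 \le j \le n).$ (5)


Since $m_j > 0$, $|c_j| < 1$. Define


$h(\lambda) = f\left( \left( \frac{\lambda + c_1}{1 + \bar{c}_1 \lambda} \right)^{m_1}, \ldots, \left( \frac{\lambda + c_n}{1 + \bar{c}_n \lambda} \right)^{m_n} \right)$ (6)


for $\lambda \in \bar{U}$. Then $h \in A(U)$, $h(T) \subset C^*$, $h(0) = f(a)$. Hence $f(a) \ne 0$ follows from

$\mathrm{Ind}\, h|_T = 0$ (7)

because of (III).
Since $(f \circ \Phi)(\bar{U}) = f(K) \subset C^*$, by assumption, (II) shows that


$\mathrm{Ind}\, f \circ \Phi|_T = 0.$ (8)

There are continuous real-valued functions $\alpha_j$, $\beta_j$ such that


$\varphi_j(e^{i\theta}) = \exp\{2\pi i \alpha_j(\theta)\}, \quad \left( \frac{e^{i\theta} + c_j}{1 + \bar{c}_j e^{i\theta}} \right)^{m_j} = \exp\{2\pi i \beta_j(\theta)\}$

on $[0, 2\pi]$. Note that

$\alpha_j(2\pi) - \alpha_j(0) = m_j = \beta_j(2\pi) - \beta_j(0).$ (9)


Define

$\gamma_{j,s}(\theta) = s \alpha_j(\theta) + (1 - s) \beta_j(\theta) \quad (0 \le s \le 1,\ 0 \le \theta \le 2\pi)$ (10)


and let $\Psi_s: [0, 2\pi] \to T^n$ be the map whose j-th component is $\exp\{2\pi i \gamma_{j,s}(\theta)\}$.
Then


$\Psi_s(2\pi) = \Psi_s(0) \quad (0 \le s \le 1),$ (11)

$\Psi_s(T) \subset T^n$, hence $f(\Psi_s(T)) \subset C^*$, and now (I) shows that

$\mathrm{Ind}\, f \circ \Psi_1 = \mathrm{Ind}\, f \circ \Psi_0.$ (12)

Since $f \circ \Phi|_T = f \circ \Psi_1$ and $h|_T = f \circ \Psi_0$, (12) and (8) imply (7). ∎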


Remarks. (i) The simplest example of a $\Phi$ as in the theorem is
$\Phi(\lambda) = (\lambda, \lambda, \ldots, \lambda)$. Then K is a disc (2-dimensional), $\dim(T^n \cup K) = n$, whereas
$\dim \bar{U}^n = 2n$.

(ii) It is not necessary for $\Phi$ to map U into the interior $U^n$ of $\bar{U}^n$. For example,
when n = 2,


$\Phi(re^{i\theta}) = \begin{cases} (2re^{i\theta}, 0) & (0 \le r \le 1/2) \\ (e^{i\theta}, 2r - 1) & (1/2 \le r \le 1) \end{cases}$

will do nicely.

(iii) The hypothesis “$m_j > 0$ for all j” cannot be omitted. To see this, take
n = 2, $\Phi(\lambda) = (\lambda, \bar{\lambda})$. Then $m_1 = 1$, $m_2 = -1$. If $f(z, w) = 1 + 4zw$, then $|f| \ge 1$
on $T^2 \cup \Phi(\bar{U})$ but $f(\frac{1}{2}, -\frac{1}{2}) = 0$.

For another example, take $\Phi(\lambda) = (\lambda, 1)$, so that $m_1 = 1$, $m_2 = 0$, and put
$f(z, w) = 2w - 1$. Then $|f| \ge 1$ on $T^2 \cup \Phi(\bar{U})$ but $f(z, 1/2) = 0$ for all z.

However, the hypothesis “$m_j > 0$ for all j” can be replaced by “$m_j < 0$ for
all j” because the theorem can then be applied to $\Phi(\bar{\lambda})$ in place of $\Phi(\lambda)$.
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Rationals and the Modular Group 


Roger C. Alperin 


The modular group $\mathcal{M}$ is the quotient group $PSL_2(\mathbb{Z}) = SL_2(\mathbb{Z})/\{\pm I\}$ of $SL_2(\mathbb{Z})$,
the group of $2 \times 2$ integer matrices of determinant 1. In [1] we gave an elementary
proof that $\mathcal{M}$ has the structure of a free product of a cyclic group of order 2
generated by the image of $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$ and a cyclic group of order 3 generated by
the image of $B = \begin{pmatrix} 0 & -1 \\ 1 & 1 \end{pmatrix}$.

The free product structure provides a description of the non-trivial elements of
$\mathcal{M}$ as unique strings of $A$’s and $B$’s with the property that there are no two
consecutive $A$’s and no three consecutive $B$’s; we refer to these as reduced strings.
We explained this free product structure in terms of the action of the modular
group on the irrationals. In this note we describe the action on the rationals; this
can be viewed as a way of describing the inverse of the Euclidean algorithm.

The group $SL_2(\mathbb{Z})$ acts via linear transformations on $\mathbb{R}^2$ as column vectors and
this gives an action of $\mathcal{M}$ via linear fractional transformations on the projective
line $P^1(\mathbb{R})$, the real numbers together with $\infty$. We may also view $P^1(\mathbb{R})$ as the
slopes of non-zero vectors, that is, the equivalence classes of $\mathbb{R}^2 - \left\{ \begin{pmatrix} 0 \\ 0 \end{pmatrix} \right\}$ induced by
non-zero scalar multiplication; the equivalence class of the vector $e = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ is
denoted $\infty$, the equivalence class of the vector $\begin{pmatrix} p \\ q \end{pmatrix}$, $q \neq 0$, is the same as that of
$\begin{pmatrix} p/q \\ 1 \end{pmatrix}$ and corresponds to the real number $z = p/q$. For the matrix $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ in
$SL_2(\mathbb{Z})$ the induced action on $P^1(\mathbb{R})$ is given by

\[ z \to \frac{az + b}{cz + d}. \]

The induced action for the generating elements is given as

\[ A : z \to \frac{-1}{z}, \qquad B : z \to \frac{-1}{z + 1}, \qquad B^2 : z \to -1 - \frac{1}{z}. \]

The orbit of $e$ is easily seen to be in correspondence with the set of all first
columns of matrices from $SL_2(\mathbb{Z})$. Thus the orbit of $\infty$ is in 1-1 correspondence
with the projective line $P^1(\mathbb{Q})$, consisting of the set $\mathbb{Q}$ of all reduced fractions
together with $\infty$; from elementary group theory this is in 1-1 correspondence with
the set of left cosets of the stabilizer $\mathcal{S}$ of $\infty$, which is the image of the subgroup
generated by $AB$ in $\mathcal{M}$.

Using the free product description of $\mathcal{M}$ we can also describe the set of coset
representatives as reduced strings of $A$’s and $B$’s. First, a non-trivial coset
representative cannot end in $AB$ or its inverse $B^2A$; therefore if it ends in $A$ it is
either $A$ or of the form $ZBA$ with $Z$ ending in $A$ or trivial; if it ends in $B$ it is
either $B$ or $ZB^2$ with $Z$ ending in $A$ or trivial. Thus, as a first pass, the set of coset
representatives is the set $\mathcal{R} = \{I\} \cup \{A\} \cup \{B\} \cup \{BA\} \cup \{B^2\} \cup \{ZBA \mid Z$ any
string ending in $A\} \cup \{ZB^2 \mid Z$ any string ending in $A\}$. Next, to determine the
distinct coset representatives we just observe that the free product description
gives a unique expression for the elements. The coset equivalence relation $X \equiv Y$
on reduced strings $X, Y \in \mathcal{R}$ is $Y = X(AB)^n$ or $Y = X(B^2A)^n$ for some non-negative
integer $n$. We see easily that $A \equiv B \bmod \mathcal{S}$ and hence also $XA \equiv XB$
$\bmod\ \mathcal{S}$ for any string $X$, and hence if this is reduced, $X \neq I$ must end in $B$. Thus
we can simplify the description of the distinct coset representatives to $\mathcal{R} =
\{I\} \cup \{B\} \cup \{B^2\} \cup \{ZB^2 \mid Z$ any string ending in $A\}$. It is easy to see that no two of
these reduced strings are equivalent; for example, for $Z \neq W$, both ending in $A$,
then $ZB^2 = WB^2(B^2A)^n$ and $ZB^2 = WB^2(AB)^n$ are impossible. Thus the coset
representatives of $\mathcal{M}/\mathcal{S}$ are the distinct strings $\mathcal{R} = \{I\} \cup \{B\} \cup \{B^2\} \cup \{ZB^2 \mid Z$
any string ending in $A\}$.


We can also describe this set & as the union of &,, defined inductively as 
Ky, = {I, BY, M = {B’} 


a 


m 


ss { ALY, m+1 — {B’, B}F,, (1) 
Rms) TE yy, NF OM yi 


It is easy to see that #, has 2”*!' elements and #,, .%, each have 2” 
elements. We can rewrite (1) as 


Ad 


m+ 


1 = {AB, AB} F,, Maat = {B°A, BA)Y,,. (2) 


Simplifying (2) we obtain the following result. $\mathcal{F}(x, y)$ denotes the free semigroup
with the generators $x$, $y$.


Proposition. $P^1(\mathbb{Q})$ is in 1-1 correspondence with $\mathcal{R} = \{I\} \cup \{B\} \cup \mathcal{F}(AB, AB^2) \cdot AB^2 \cup \mathcal{F}(B^2A, BA) \cdot B^2$.


Observing that $I\infty = \infty$, $B\infty = 0$, $B^2\infty = -1$ and $AB^2\infty = 1$, and $P^1(\mathbb{Q}) - \{\infty, 0\}$
is the positive and negative rationals, we have the following


Corollary. The set of positive rationals is the orbit of the free semigroup generated by
$AB$ and $AB^2$ on $z = 1$. The set of negative rationals is the orbit of the free semigroup
generated by $B^2A$ and $BA$ on $z = -1$.


These upper $U = AB$, $U^{-1} = B^2A$ and lower $L = AB^2$, $L^{-1} = BA$ triangular
matrix actions corresponding to these semigroup generators are

\[ U : z \to z + 1, \qquad L : z \to \frac{z}{z + 1}, \]

\[ U^{-1} : z \to z - 1, \qquad L^{-1} : z \to \frac{z}{1 - z}. \]


Every positive rational is uniquely expressible in terms of semigroup generators
as an element of the orbit of 1. Alternatively, starting from a reduced positive
rational $z = p/q$ we can apply a greedy or Euclidean recursion to obtain a finite
sequence that stabilizes at 1:

\[ \frac{p}{q} \to \begin{cases} \dfrac{p - q}{q} & \text{if } p > q, \\[2ex] \dfrac{p}{q - p} & \text{if } p < q, \\[2ex] 1 & \text{if } p = q = 1. \end{cases} \]

Here we apply $U^{-1}$ or $L^{-1}$ depending on whether $z > 1$ or $z < 1$. For
example, the sequence

\[ \frac{34}{55} \to \frac{34}{21} \to \frac{13}{21} \to \frac{13}{8} \to \frac{5}{8} \to \frac{5}{3} \to \frac{2}{3} \to \frac{2}{1} \to 1 \]

corresponds to $(U^{-1}L^{-1})^4(34/55) = 1$. Since we have shown that the set of positive
rationals can be described by a free semigroup, this means that $(LU)^4L$ is the
coset representative in $\mathcal{R}_8$ corresponding to 34/55 as described in the Corollary.
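
The recursion is short enough to run mechanically. The following is a minimal sketch, assuming the actions $U^{-1} : z \to z - 1$ and $L^{-1} : z \to z/(1 - z)$ stated above; the function names `reduce_to_one` and `word_from_one` are introduced here for illustration only.

```python
from fractions import Fraction

def reduce_to_one(z):
    """Apply U^{-1}: z -> z - 1 when z > 1 and L^{-1}: z -> z/(1 - z) when z < 1,
    recording each step until the value 1 is reached."""
    z = Fraction(z)
    if z <= 0:
        raise ValueError("expects a positive rational")
    steps, path = [], [z]
    while z != 1:
        if z > 1:
            z = z - 1
            steps.append("U^-1")
        else:
            z = z / (1 - z)
            steps.append("L^-1")
        path.append(z)
    return steps, path

def word_from_one(z):
    """Word in U and L (rightmost letter acts first) that carries 1 to z."""
    steps, _ = reduce_to_one(z)
    # Undoing the steps in reverse order inverts each letter, so the word
    # reads off the recorded steps in the order they were applied.
    return "".join(step[0] for step in steps)

steps, path = reduce_to_one(Fraction(34, 55))
word = word_from_one(Fraction(34, 55))
```

For 34/55 the recorded path visits 34/21, 13/21, 13/8, 5/8, 5/3, 2/3, 2 and 1, and the word is `LULULULU`, that is $(LU)^4$ acting on 1. The loop terminates because each step strictly decreases the sum of numerator and denominator.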


Figure 1. Zp. 7 | Figure 2. 29. 


Finally, we consider the matrix action of the distinct non-trivial coset representatives
in $\mathcal{R}$ on the column $\begin{pmatrix} 1 \\ 0 \end{pmatrix}$ and the plots of these images in $\mathbb{R}^2$. These images
are just the points with relatively prime coordinates in the upper half-plane. We
obtain the fascinating plant-like structures in Figures 1 and 2. For example, the


0 ‘a - oo e s ( +34) (+55); 
distant points from the root (°| are the ‘Fibonacci points ( Be i ( a In By. 
The Union of Vieta’s and Wallis’s
Products for Pi 


Thomas J. Osler 


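The beautiful infinite product of radicals

\[ \frac{2}{\pi} = \sqrt{\frac{1}{2}} \, \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}} \, \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}} \cdots \tag{1} \]

due to Vieta in 1592 [2], is one of the oldest noniterative analytical expressions
for $\pi$. Wallis’s product dating from 1655 [3]

\[ \frac{2}{\pi} = \frac{1 \cdot 3}{2 \cdot 2} \cdot \frac{3 \cdot 5}{4 \cdot 4} \cdot \frac{5 \cdot 7}{6 \cdot 6} \cdots \tag{2} \]

is also most remarkable. Both are usually included in any list of interesting
expressions for $\pi$ [1].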

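The purpose of this short note is to call attention to the following union of
Vieta and Wallis-like products:

\[ \frac{2}{\pi} = \prod_{n=1}^{p} \underbrace{\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \cdots + \frac{1}{2}\sqrt{\frac{1}{2}}}}}_{n \text{ radicals}} \; \prod_{n=1}^{\infty} \frac{2^{p+1}n - 1}{2^{p+1}n} \cdot \frac{2^{p+1}n + 1}{2^{p+1}n}. \tag{3} \]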


While (1) and (2) seem unrelated, they are both special cases of a more general 
double product (3). The first product in (3) consists of the first p factors of Vieta’s 
original infinite product (1). The second product in (3) is a Wallis-like product. We 
say this because the case p = 0 gives us Wallis’s original product (2), and for other 
values of p it is Wallis’s product with factors deleted. Notice also that the 
Wallis-like product in (3) provides us with the error factor needed to make the 
Vieta product (1) exact when only finitely many factors are used. 

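Relation (3) yields Vieta’s product (1) when $p$ goes to infinity, and Wallis’s
product (2) when $p = 0$. For each intermediate value of $p = 1, 2, 3, \ldots$ we obtain
united Vieta-Wallis-like products:

\[ p = 0: \quad \frac{2}{\pi} = \frac{1 \cdot 3}{2 \cdot 2} \cdot \frac{3 \cdot 5}{4 \cdot 4} \cdot \frac{5 \cdot 7}{6 \cdot 6} \cdot \frac{7 \cdot 9}{8 \cdot 8} \cdot \frac{9 \cdot 11}{10 \cdot 10} \cdot \frac{11 \cdot 13}{12 \cdot 12} \cdots \quad \text{(Wallis's original product)} \]

\[ p = 1: \quad \frac{2}{\pi} = \sqrt{\frac{1}{2}} \cdot \frac{3 \cdot 5}{4 \cdot 4} \cdot \frac{7 \cdot 9}{8 \cdot 8} \cdot \frac{11 \cdot 13}{12 \cdot 12} \cdot \frac{15 \cdot 17}{16 \cdot 16} \cdot \frac{19 \cdot 21}{20 \cdot 20} \cdots \]

\[ p = 2: \quad \frac{2}{\pi} = \sqrt{\frac{1}{2}} \, \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}} \cdot \frac{7 \cdot 9}{8 \cdot 8} \cdot \frac{15 \cdot 17}{16 \cdot 16} \cdot \frac{23 \cdot 25}{24 \cdot 24} \cdot \frac{31 \cdot 33}{32 \cdot 32} \cdots \]

\[ p = 3: \quad \frac{2}{\pi} = \sqrt{\frac{1}{2}} \, \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}} \, \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}} \cdot \frac{15 \cdot 17}{16 \cdot 16} \cdot \frac{31 \cdot 33}{32 \cdot 32} \cdot \frac{47 \cdot 49}{48 \cdot 48} \cdot \frac{63 \cdot 65}{64 \cdot 64} \cdots \]

\[ p = \infty: \quad \frac{2}{\pi} = \sqrt{\frac{1}{2}} \, \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}} \, \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2}}}} \cdots \quad \text{(Vieta's original product)} \]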

An examination of these special cases of (3) shows that each time we increase p by 
one, we insert one new radical factor in the Vieta-like product, and remove 
alternate factors from the Wallis-like product. The author accidentally discovered (3)
while trying to derive (1). 

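To derive (3) we start by applying the double angle formula for the sine function
$p$ times to obtain

\[ \begin{aligned} \sin\theta &= 2\cos\frac{\theta}{2}\sin\frac{\theta}{2} \\ &= 2^2\cos\frac{\theta}{2}\cos\frac{\theta}{2^2}\sin\frac{\theta}{2^2} \\ &= 2^3\cos\frac{\theta}{2}\cos\frac{\theta}{2^2}\cos\frac{\theta}{2^3}\sin\frac{\theta}{2^3}, \end{aligned} \]

\[ \sin\theta = 2^p\cos\frac{\theta}{2}\cos\frac{\theta}{2^2}\cos\frac{\theta}{2^3}\cdots\cos\frac{\theta}{2^p}\sin\frac{\theta}{2^p}. \tag{4} \]

Next we use the infinite product for the sine function [4], valid for all $x$,

\[ \sin x = x\prod_{n=1}^{\infty}\left(1 - \frac{x^2}{\pi^2 n^2}\right) = x\prod_{n=1}^{\infty}\frac{\pi n - x}{\pi n} \cdot \frac{\pi n + x}{\pi n}, \]

with $x = \theta/2^p$ to replace the last factor in (4). Dividing by $\theta$ gives

\[ \frac{\sin\theta}{\theta} = \cos\frac{\theta}{2}\cos\frac{\theta}{2^2}\cos\frac{\theta}{2^3}\cdots\cos\frac{\theta}{2^p}\prod_{n=1}^{\infty}\frac{2^p\pi n - \theta}{2^p\pi n} \cdot \frac{2^p\pi n + \theta}{2^p\pi n}. \tag{5} \]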
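Now express each of the cosine factors in (5) in terms of $\cos\theta$ by repeated use of
the half-angle formula for the cosine; here we assume $-\pi/2 \leq \theta \leq \pi/2$ so that
the cosines are never negative.

\[ \cos\frac{\theta}{2} = \sqrt{\frac{1}{2} + \frac{1}{2}\cos\theta} \]

\[ \cos\frac{\theta}{2^2} = \sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\cos\theta}} \]

\[ \cos\frac{\theta}{2^p} = \underbrace{\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \cdots + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\cos\theta}}}}_{p \text{ radicals}} \tag{6} \]

Combining (6) with (5) we obtain

\[ \frac{\sin\theta}{\theta} = \prod_{n=1}^{p} \underbrace{\sqrt{\frac{1}{2} + \frac{1}{2}\sqrt{\frac{1}{2} + \cdots + \frac{1}{2}\sqrt{\frac{1}{2} + \frac{1}{2}\cos\theta}}}}_{n \text{ radicals}} \; \prod_{n=1}^{\infty}\frac{2^p\pi n - \theta}{2^p\pi n} \cdot \frac{2^p\pi n + \theta}{2^p\pi n}. \tag{7} \]

If we set $\theta = \pi/2$ in (7) and simplify we obtain (3).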
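
A quick numerical check of (3) is possible by truncating the infinite Wallis-like product. The sketch below assumes the reading of (3) as $p$ nested-radical factors times $\prod_n (2^{p+1}n - 1)(2^{p+1}n + 1)/(2^{p+1}n)^2$; the function names and the truncation length are illustrative choices, and the truncated product only approximates $2/\pi$.

```python
import math

def vieta_factor(n):
    """Nested radical with n square roots: sqrt(1/2 + 1/2*sqrt(... sqrt(1/2)))."""
    value = math.sqrt(0.5)
    for _ in range(n - 1):
        value = math.sqrt(0.5 + 0.5 * value)
    return value

def vieta_wallis(p, terms=200_000):
    """Truncated right-hand side of (3): p Vieta factors times a Wallis-like product."""
    vieta = 1.0
    for n in range(1, p + 1):
        vieta *= vieta_factor(n)
    step = 2 ** (p + 1)
    wallis = 1.0
    for n in range(1, terms + 1):
        m = step * n
        wallis *= (m - 1) * (m + 1) / (m * m)
    return vieta * wallis

# One approximation of 2/pi for each of p = 0, 1, 2, 3.
approximations = {p: vieta_wallis(p) for p in range(4)}
```

Each entry of `approximations` should sit close to $2/\pi \approx 0.63662$, and the agreement improves as $p$ grows because the deleted Wallis factors are the ones that converge most slowly.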
PROBLEMS AND SOLUTIONS


Edited by Gerald A. Edgar, Daniel H. Ullman, and Douglas B. West 


with the collaboration of Paul T. Bateman, Mario Benedicty, Paul Bracken, Duane M. Broline, Ezra 
A. Brown, Richard T. Bumby, Glenn G. Chappell, Randall Dougherty, Roger B. Eggleton, Ira M. 
Gessel, Bart Goddard, Jerrold R. Griggs, Douglas A. Hensley, John R. Isbell, Robert Israel, Kiran 
S. Kedlaya, Murray S. Klamkin, Fred Kochman, Frederick W. Luttmann, Vania Mascioni, Frank 
B. Miles, Richard Pfiefer, Cecil C. Rousseau, Leonard Smiley, John Henry Steelman, Kenneth Sto- 
larsky, Richard Stong, Charles Vanden Eynden, and William E. Watkins. 


Proposed problems and solutions should be sent in duplicate to the MONTHLY
problems address on the inside front cover. Submitted problems should include
solutions and relevant references. Submitted solutions should arrive at that address
before March 31, 2000. Additional information, such as generalizations and references,
is welcome. The problem number and the solver’s name and address should
appear on each solution. An acknowledgement will be sent only if a mailing label
is provided. An asterisk (*) after the number of a problem or a part of a problem
indicates that no solution is currently available.


PROBLEMS 


10753. Proposed by Louis Shapiro, Howard University, Washington, DC. An ordered tree
is a rooted tree in which the children of each node form a sequence as opposed to a set. The
5 ordered trees with 3 edges are

[Figure: the five ordered trees with 3 edges]

The number of ordered trees with $n$ edges is the $n$th Catalan number $\binom{2n}{n}/(n+1)$. Therefore,
if one draws each of the ordered trees with $n$ edges, one draws a total of $\binom{2n}{n}$ nodes. Prove
that exactly half of these nodes are end-nodes (i.e., leaves with no children).
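
Small cases of Problem 10753 can be checked by brute force. The sketch below is only an exploratory enumeration, not a proof; the representation of a tree as a nested tuple of its children and all function names are assumptions made here for illustration.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def trees(edges):
    """All ordered trees with the given number of edges, as nested tuples of children."""
    if edges == 0:
        return ((),)
    result = []
    # The first child subtree uses k edges plus one edge to the root;
    # the remaining children form an ordered tree with edges - 1 - k edges.
    for k in range(edges):
        for first in trees(k):
            for rest in trees(edges - 1 - k):
                result.append((first,) + rest)
    return tuple(result)

def count_nodes(tree):
    return 1 + sum(count_nodes(child) for child in tree)

def count_leaves(tree):
    if not tree:
        return 1
    return sum(count_leaves(child) for child in tree)

def totals(edges):
    """(number of trees, total nodes, total leaves) over all ordered trees with `edges` edges."""
    forest = trees(edges)
    return (len(forest),
            sum(count_nodes(t) for t in forest),
            sum(count_leaves(t) for t in forest))
```

For example, `totals(3)` returns `(5, 20, 10)`: five trees, twenty nodes in all, ten of them leaves.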


10754. Proposed by Paul Bracken, Université de Montréal, Montréal, PQ, Canada. Let 
C(s) = oro, k-*, and let p(s,n) = )o¢,,,, k~*. Show that for positive integers s > 2, 


et) 


a 


— p(s,k) _ s Ws 
2. ; a da cine a 


=] 


10755. Proposed by Jiro Fukuta, Motosu-gun, Gifu-ken, Japan. An arbitrary circle $O$ is
drawn through vertices $B$ and $D$ of a convex quadrilateral $ABCD$. Let $O_1$ be the circle
tangent to lines $AB$ and $AD$ and tangent to $O$ internally at a point of $O$ on the opposite
side of line $BD$ from $A$. Let $O_2$ be the circle tangent to lines $CB$ and $CD$ and tangent to
$O$ internally at a point of $O$ on the opposite side of line $BD$ from $C$. Let $R_1$ and $R_2$ be
the radii of circles $O_1$ and $O_2$, respectively, and let $r_1$ and $r_2$ be the radii of the incircles of
triangles $ABD$ and $CBD$, respectively. Prove that the quadrilateral $ABCD$ is inscribable
in a circle if and only if $r_1/R_1 + r_2/R_2 = 1$.
10756. Proposed by Douglas Iannucci, University of the Virgin Islands, St. Thomas, VI.
Prove that

\[ \cos\frac{\pi}{7} = \frac{1}{6} + \frac{\sqrt{7}}{6}\left( \cos\left( \frac{1}{3}\arccos\frac{1}{2\sqrt{7}} \right) + \sqrt{3}\,\sin\left( \frac{1}{3}\arccos\frac{1}{2\sqrt{7}} \right) \right). \]
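
The identity in Problem 10756 can be sanity-checked in floating point before attempting a proof. The sketch below assumes the reading of the formula with $1/(2\sqrt{7})$ inside the arccosine; the variable names are illustrative, and agreement to machine precision is evidence, not a proof.

```python
import math

# One third of arccos(1 / (2*sqrt(7))), the angle that appears twice on the right.
angle = math.acos(1 / (2 * math.sqrt(7))) / 3

right_side = 1 / 6 + (math.sqrt(7) / 6) * (math.cos(angle) + math.sqrt(3) * math.sin(angle))
left_side = math.cos(math.pi / 7)
```

Both sides evaluate to approximately 0.900969.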


10757. Proposed by Mark Kidwell, United States Naval Academy, Annapolis, MD. Given
integers $a_0, a_1, a_2, \ldots, a_n$ with $a_i \neq 0$ for $i \geq 1$, write $[a_0; a_1, a_2, \ldots, a_n]$ for the continued
fraction

\[ a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_n}}}. \]

Every positive rational number has a unique representation as $[a_0; a_1, a_2, \ldots, a_n]$ if we
require that $a_0 \geq 0$, $a_i > 0$ for $1 \leq i \leq n - 1$, and $a_n > 1$ (we call this the standard
representation), but it can have other representations $[b_0; b_1, b_2, \ldots, b_m]$ if we permit negative values
for some of the $b_i$ or if we permit $b_m = 1$. For example, $11/3 = [3; 1, 2] = [3; 1, 1, 1] =
[4; -3]$. Prove or disprove: If $r$ is a positive rational number, $r = [a_0; a_1, a_2, \ldots, a_n]$
is the standard representation, and $r = [b_0; b_1, b_2, \ldots, b_m]$ is another representation, then
$a_0 + a_1 + \cdots + a_n \leq |b_0| + |b_1| + \cdots + |b_m|$, with strict inequality if any of the $b_i$ are negative.
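
The three representations of 11/3 quoted in Problem 10757 can be evaluated exactly with rational arithmetic. This is a minimal sketch; the helper names `evaluate` and `weight` are introduced here for illustration and are not part of the problem statement.

```python
from fractions import Fraction

def evaluate(terms):
    """Value of [a0; a1, ..., an], working from the innermost term outward."""
    value = Fraction(terms[-1])
    for a in reversed(terms[:-1]):
        value = a + 1 / value
    return value

def weight(terms):
    """Sum of absolute values of the terms, the quantity compared in the problem."""
    return sum(abs(a) for a in terms)

examples = [[3, 1, 2], [3, 1, 1, 1], [4, -3]]
values = [evaluate(t) for t in examples]
weights = [weight(t) for t in examples]
```

All three lists evaluate to 11/3, with weights 6, 6 and 7: equality for the representation ending in 1 and a strictly larger sum for the one with a negative term, in line with the conjectured inequality for this single example.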


10758. Proposed by Mark Sapir, Vanderbilt University, Nashville, TN. Prove that the sum 
of the (decimal) digits of $9^n$ cannot equal 9 when $n > 2$.
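
A finite search gives some feel for Problem 10758, although it proves nothing about all $n$. The sketch below uses Python's arbitrary-precision integers; the search bound of 500 and the variable names are arbitrary choices made here.

```python
def digit_sum(value):
    """Sum of the decimal digits of a nonnegative integer."""
    return sum(int(ch) for ch in str(value))

# Exponents n in the searched range whose power 9**n has digit sum exactly 9.
exceptions = [n for n in range(3, 501) if digit_sum(9 ** n) == 9]

# The excluded cases n = 1, 2 and the first included case n = 3.
small_cases = [digit_sum(9 ** n) for n in (1, 2, 3)]
```

The digit sum is always a multiple of 9; the cases $n = 1$ and $n = 2$ give exactly 9, which is why the problem restricts to $n > 2$.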


10759. Proposed by Calin Popescu, Université Catholique de Louvain, Louvain-la-Neuve, 
Belgium. In triangle $ABC$, let $h_a$ denote the altitude to the side $BC$ and let $r_a$ denote the
exradius relative to side $BC$, i.e., the radius of the circle tangent to the extensions of sides
$AB$ and $AC$ and to the side $BC$ externally. Define $h_b$, $h_c$, $r_b$, and $r_c$ correspondingly. Prove
that hur! +hir, there <rir, trpr¢ +rérg for any integer n, and determine conditions 


for equality. 
SOLUTIONS 


Common Eigenvector of Commuting Matrices 


10633 [1997, 975]. Proposed by Kiran S. Kedlaya, Princeton University, Princeton, NJ. Let
$S$ be a commuting family of $n$-by-$n$ matrices over an arbitrary field. Suppose the matrices
in $S$ have a common eigenvector $v$, so that $Mv = \lambda_M v$ for all $M \in S$. Prove that the
transposes of these matrices also have a common eigenvector with these eigenvalues, that
is, a vector $w$ satisfying $M^T w = \lambda_M w$ for all $M \in S$.


Solution by Alain Tissier, Montfermeil, France. Let $K$ be the field. Set $\varphi(M) = M - \lambda_M I$
and $\varphi(S) = \{\varphi(M) : M \in S\}$. Thus $\varphi(S)$ is a commuting family of $n \times n$ matrices over
$K$ having a common nonzero vector $v$ such that $\varphi(M)v = 0$ for all $\varphi(M) \in \varphi(S)$. Since
$\varphi(M)^T = M^T - \lambda_M I$, we have to prove only that the transposes of the matrices in $\varphi(S)$
have a common nonzero vector $w$ satisfying $\varphi(M)^T w = 0$ for $\varphi(M) \in \varphi(S)$. Thus we
may suppose that $\lambda_M = 0$ for every $M$.

If all matrices in $S$ are nilpotent, then the collection of transposes is also a commuting
family of nilpotent matrices. In this case there is a nonzero vector $w$ such that $M^T w = 0$ for
all $M \in S$ (section 3.3 of J. E. Humphreys, Introduction to Lie Algebras and Representation
Theory, Springer-Verlag, 1972). So we may assume that not all elements of $S$ are nilpotent.

We proceed by induction on $n$. When $n = 1$ all the matrices are zero, so the conclusion
is true. Take $n > 1$, and suppose the result is true for $h$-by-$h$ matrices for each $h < n$. Let $N$
be a nonnilpotent element of $S$. Let $W$ be the set of all vectors $x$ such that $N^k x = 0$ for some
$k > 0$. By finite-dimensionality, there is a fixed $k$ such that $N^k x = 0$ for all $x \in W$. So
$v \in W$, $W$ is a subspace, and $K^n = W \oplus U$, where $U$ is the range of the mapping $x \mapsto N^k x$.
Now if $M \in S$, then $M$ commutes with $N$, and the descriptions of $W$ and $U$ show that they
are invariant under $M$. Let $m$ be the dimension of $W$, let $B'$ be a basis of $W$, and let $B''$ be
a basis of $U$. For each $M \in S$, let $M'$ be the $B'$-representation of $M$ restricted to $W$ and let
$M''$ be the $B''$-representation of $M$ restricted to $U$. Then there exists a nonsingular $n \times n$
matrix $P$ such that $P^{-1}MP = \begin{pmatrix} M' & 0 \\ 0 & M'' \end{pmatrix}$ for all $M \in S$. Let $S'$ be the set of the matrices
$M'$. Then $S'$ is a family of $m \times m$ commuting matrices having a common nonzero vector
$v'$ such that $M'v' = 0$ for each $M' \in S'$. By the induction hypothesis there exists a nonzero
vector $w'$ such that $M'^T w' = 0$ for each $M' \in S'$. The vector $(P^T)^{-1}\begin{pmatrix} w' \\ 0 \end{pmatrix}$ solves the
problem.
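
The closing claim can be verified in one line. The following is a sketch that assumes the block-diagonal form of $P^{-1}MP$ described in the solution; the name $D_M$ is introduced here only as shorthand for that block matrix.

```latex
\[
D_M = P^{-1} M P = \begin{pmatrix} M' & 0 \\ 0 & M'' \end{pmatrix},
\qquad
M^{T} = (P^{T})^{-1} D_M^{T} P^{T},
\]
\[
M^{T} (P^{T})^{-1} \begin{pmatrix} w' \\ 0 \end{pmatrix}
= (P^{T})^{-1} D_M^{T} \begin{pmatrix} w' \\ 0 \end{pmatrix}
= (P^{T})^{-1} \begin{pmatrix} M'^{T} w' \\ 0 \end{pmatrix}
= 0 .
\]
```

The vector is nonzero because $w' \neq 0$ and $(P^T)^{-1}$ is nonsingular, and the induction applies because $m < n$ when $N$ is not nilpotent.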


Solved also by R. J. Chapman (U. K.), D. Huang, J. H. Lindsey II, G. Sansigre Vidal (Spain), GCHQ Problems Group (U. K.), and 
the proposer. 


Reflected Concurrent Lines 


10637 [1998, 68]. Proposed by C. F. Parry, Exmouth, Devon, United Kingdom. Suppose
triangle $ABC$ has circumcircle $\Gamma$, circumcenter $O$, and orthocenter $H$. Parallel lines $\alpha$, $\beta$, $\gamma$
are drawn through the vertices $A$, $B$, $C$, respectively. Let $\alpha'$, $\beta'$, $\gamma'$ be the reflections of
$\alpha$, $\beta$, $\gamma$ in the sides $BC$, $CA$, $AB$, respectively.

(a) Show that $\alpha'$, $\beta'$, $\gamma'$ are concurrent if and only if $\alpha$, $\beta$, $\gamma$ are parallel to the Euler line
$OH$.

(b) Suppose that $\alpha'$, $\beta'$, $\gamma'$ are concurrent at the point $P$. Show that $\Gamma$ bisects $OP$.


Solution by Robert L. Young, Osterville, MA. Take $\Gamma$ to be the unit circle $z\bar{z} = 1$ in the
complex plane and rotate $ABC$ about $O$ so that $\arg H = 0$. Assume $H \neq 0$ for now, so the
Euler line exists and is the real axis. Choose $\theta_3 > \theta_2 > \theta_1 \geq 0$ so that $A = e^{i\theta_1}$, $B = e^{i\theta_2}$,
and $C = e^{i\theta_3}$, and let $M = e^{i\theta}$, where $\theta \in [0, \pi)$ is the angle of inclination of the lines $\alpha$,
$\beta$, $\gamma$.


(a) The reflection $z'$ of a complex number $z$ through the line containing $B$ and $C$ is determined
as follows. Apply the linear transformation $\tau(z) = (z - B)/(C - B)$, which takes $B$ and $C$
and therefore the line $BC$ to the real axis. Since reflection in the real axis is conjugation,

\[ z' = \tau^{-1}\left(\overline{\tau(z)}\right) = \frac{(\bar{z} - \bar{B})(C - B)}{\bar{C} - \bar{B}} + B = -BC\bar{z} + B + C, \]

and the reflection of $A$ through line $BC$ is

\[ A' = -BC\bar{A} + B + C. \tag{1} \]

Any $z \neq A'$ on $\alpha'$ satisfies the equation

\[ \frac{z - A'}{\bar{z} - \bar{A'}} = e^{2i \arg \alpha'}. \tag{2} \]

Since the perpendicular bisector of line $BC$ passes through $O$ and $\exp(i(\theta_2 + \theta_3)/2)$, we
have $\arg(C - B) = (\theta_2 + \theta_3)/2 - \pi/2$ modulo $\pi$. By the definition of $\alpha'$, $\arg\alpha' + \arg\alpha =
2\arg(C - B) = \theta_2 + \theta_3 - \pi$ modulo $2\pi$, so $e^{2i\arg\alpha'} = e^{i(2\theta_2 + 2\theta_3 - 2\theta)} = B^2C^2\bar{M}^2$.
Substituting (1) into (2), we conclude that $\alpha'$ has equation

\[ z = \bar{M}^2 C^2 B^2 \left( \bar{z} + A\bar{B}\bar{C} - \bar{B} - \bar{C} \right) - BC\bar{A} + B + C. \]
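
The closed form for the reflection in a chord of the unit circle can be cross-checked against the usual projection construction. The sketch below assumes the formula $z' = -BC\bar{z} + B + C$ for $|B| = |C| = 1$; the function names and the sample points are illustrative choices.

```python
import cmath

def reflect_closed_form(z, B, C):
    """Reflection in chord BC of the unit circle: z' = -B*C*conj(z) + B + C."""
    return -B * C * z.conjugate() + B + C

def reflect_by_projection(z, B, C):
    """Independent check: find the foot of the perpendicular from z to line BC,
    then move z to the mirror position on the other side of that foot."""
    d = C - B
    t = ((z - B) * d.conjugate()).real / abs(d) ** 2
    foot = B + t * d
    return 2 * foot - z

B = cmath.exp(1j * 0.7)   # sample points on the unit circle
C = cmath.exp(1j * 2.9)
z = 0.3 - 1.1j            # an arbitrary point to reflect
difference = abs(reflect_closed_form(z, B, C) - reflect_by_projection(z, B, C))
```

The two constructions agree up to rounding, and both fix the points $B$ and $C$ themselves.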
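It is convenient to note that $A + B + C = H$ and is therefore real and to write $K = ABC$,
so that $AB + BC + CA = K\bar{H} = KH$. With this notation, the equation becomes

\[ z = \bar{M}^2 K^2 \bar{A}^2 \left( \bar{z} + (A - C - B)\bar{B}\bar{C} \right) + (AB + AC - BC)\bar{A}, \]

or

\[ z = K(\bar{M}^2 K \bar{z} - 2)\bar{A}^2 - (\bar{M}^2 - 1)KH\bar{A} + 2\bar{M}^2 K. \]

Similarly, the equation of $\beta'$ is

\[ z = K(\bar{M}^2 K \bar{z} - 2)\bar{B}^2 - (\bar{M}^2 - 1)KH\bar{B} + 2\bar{M}^2 K. \]

Let $z_C$ denote the point of intersection, if any, of $\alpha'$ and $\beta'$ and similarly for $z_A$ and $z_B$.
Solving for $z_C$ from these two equations, we get $K(\bar{M}^2 K \bar{z}_C - 2)\bar{A}^2 - (\bar{M}^2 - 1)KH\bar{A} =
K(\bar{M}^2 K \bar{z}_C - 2)\bar{B}^2 - (\bar{M}^2 - 1)KH\bar{B}$, so $K(\bar{A}^2 - \bar{B}^2)(\bar{M}^2 K \bar{z}_C - 2) = (\bar{A} - \bar{B})(\bar{M}^2 - 1)KH$,
and

\[ (\bar{M}^2 K \bar{z}_C - 2)(\bar{A} + \bar{B}) = (\bar{M}^2 - 1)H. \]

Similarly,

\[ (\bar{M}^2 K \bar{z}_B - 2)(\bar{A} + \bar{C}) = (\bar{M}^2 K \bar{z}_A - 2)(\bar{B} + \bar{C}) = (\bar{M}^2 - 1)H. \]

Suppose $\alpha'$, $\beta'$, $\gamma'$ are concurrent at $P$. Then $(\bar{A} + \bar{B})(\bar{M}^2 K \bar{P} - 2)$, $(\bar{B} + \bar{C})(\bar{M}^2 K \bar{P} - 2)$,
and $(\bar{C} + \bar{A})(\bar{M}^2 K \bar{P} - 2)$ all equal $(\bar{M}^2 - 1)H$. Multiply the first of these equations by
$\bar{B} + \bar{C}$, multiply the second by $\bar{A} + \bar{B}$, and then subtract to obtain $0 = (\bar{M}^2 - 1)H(\bar{A} - \bar{C})$.
Since $A \neq C$ and $H \neq 0$, we have $M^2 = 1$ and $\theta = 0$. So $\alpha$, $\beta$, $\gamma$ are parallel to the Euler
line as claimed. Conversely, if $\alpha$, $\beta$, $\gamma$ are parallel to the Euler line, then $M^2 = 1$, and
$z_A = z_B = z_C = P = 2K$ satisfy the equations for $\alpha'$, $\beta'$, $\gamma'$, so these are concurrent.

If $H = 0$, there is no Euler line. In this case, $\alpha'$, $\beta'$, and $\gamma'$ concur at $P = 2K\bar{M}^2$.
(b) Since $P = 2K = 2ABC$, we have $|P| = 2$. Therefore $|(O + P)/2| = 1$ and $(O + P)/2$
is on $\Gamma$.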


Solved also by J. Anglesio (France), M. Benedicty, N. Lakshmanan, and V. Schindler (Germany). 


A Constrained Maximization 


10646 [1998, 176]. Proposed by Hassan Ali Shah Ali, Teheran, Iran. Find the maximum
of $\prod_{i=1}^{n}(1 - x_i)$ over all nonnegative $x_1, x_2, \ldots, x_n$ with $\sum_{i=1}^{n} x_i^2 = 1$.


Solution by Patrick A. Staley, Southwestern College, Chula Vista, CA. When $n = 1$, the
constraint requires $x_1 = 1$, and the maximum value is 0. So assume $n \geq 2$. We show that
the maximum is $3/2 - \sqrt{2} \approx 0.0858$, and it occurs when two of the $x_i$’s are $1/\sqrt{2}$ and the
others are 0.

Let $x_1, x_2, \ldots, x_n$ be an optimal solution. If $x$ and $y$ are any two of the $x_i$’s, then they
satisfy a two-element subproblem: maximize $(1 - x)(1 - y)$ under the constraints $x \geq 0$,
$y \geq 0$, and $x^2 + y^2 = k^2$ for a given positive $k \leq 1$. To solve this, note that $dy/dx = -x/y$,
so

\[ \frac{d}{dx}\bigl[(1 - x)(1 - y)\bigr] = -(1 - y) - (1 - x)\frac{dy}{dx} = \frac{(x + y - 1)(y - x)}{y}. \]

If this vanishes, then $(x + y - 1)(x - y) = 0$. There are three possibilities for the global
maximum of $(1 - x)(1 - y)$:
(1) endpoints, $x = 0$, $y = k$ (or vice versa), so $(1 - x)(1 - y) = (1 - k)$;
(2) $y = x$, so $x = y = k/\sqrt{2}$, $(1 - x)(1 - y) = (1 - k/\sqrt{2})^2$; or
(3) $y = 1 - x$, so $x, y = (1 \pm \sqrt{2k^2 - 1})/2$ and $(1 - x)(1 - y) = (1 - k^2)/2$.
Case (3) may be discarded, since $(1 - k^2)/2 \leq (1 - k)$ for all $k$. If $k < 2(\sqrt{2} - 1) \approx 0.828$
then case (1) is maximal; otherwise, case (2) is maximal.
Now consider a three-element subproblem. Let $x$, $y$, $z$ be any three of the $x_i$’s. They
maximize $(1 - x)(1 - y)(1 - z)$ subject to $x \geq 0$, $y \geq 0$, $z \geq 0$, and $x^2 + y^2 + z^2 = h^2$
for a given positive $h \leq 1$. Now the largest element must be at least $h/\sqrt{3}$, so the other
two elements solve the two-element subproblem with $k \leq \sqrt{2/3}\,h < 0.828$, so for that
subproblem case (1) is maximal, and thus one of the variables must be 0.

Since one of every three variables must be 0, there can be at most two nonzero variables.
Those two solve the two-element problem with $k = 1$, so the maximum occurs in case (2)
and the maximum is $(1 - 1/\sqrt{2})^2 = 3/2 - \sqrt{2}$.
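
For $n = 3$ the claimed maximum can be compared with a coarse grid search over the feasible set. The sketch below parametrizes the nonnegative part of the unit sphere by two angles; the grid resolution and all names are choices made here, and a grid search can only support, not establish, the value $3/2 - \sqrt{2}$.

```python
import math

def objective(xs):
    """The product (1 - x_1)(1 - x_2)...(1 - x_n)."""
    product = 1.0
    for x in xs:
        product *= 1.0 - x
    return product

def grid_maximum(steps=200):
    """Coarse search over the part of the unit sphere with x, y, z >= 0."""
    best_value, best_point = -1.0, None
    for i in range(steps + 1):
        theta = (math.pi / 2) * i / steps
        for j in range(steps + 1):
            phi = (math.pi / 2) * j / steps
            point = (math.sin(phi) * math.cos(theta),
                     math.sin(phi) * math.sin(theta),
                     math.cos(phi))
            value = objective(point)
            if value > best_value:
                best_value, best_point = value, point
    return best_value, best_point

best_value, best_point = grid_maximum()
claimed = 1.5 - math.sqrt(2)
```

The best grid point has one coordinate equal to zero (up to rounding) and the other two equal to $1/\sqrt{2}$, so the maximum sits on the boundary, as the editorial comment below emphasizes.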


Editorial comment. There were a large number of incorrect solutions. Many of these used
Lagrange multipliers to find a local maximum for the function in question, but ignored the
possibility of a global maximum occurring at a boundary point, as it does when $n \geq 3$.


Solved also by R. A. Agnew, Z. Ahmed & A. N. Joseph & M. A. Prasad (India), R. Barbara, M. Benedicty, B. Borchers, P. Budney, 
R. J. Chapman (U. K.), C. Georghiou (Greece), G. Keselman, A. Kundgen, J. H. Lindsey II, S. Pedersen (Denmark), C. Popescu 
(Belgium), A. Rosenthal, W. J. Seaman, H. A. Steinberg, A. Stenger, J. Vandergriff, J. T. Ward, Q. Yao, GCHQ Problems Group, 
IUTS Problems Group, NSA Problems Group, and the proposer. 


A Pólya-Szegő Exercise Revisited


10650 [1998, 271]. Proposed by Zoltán Sasvári, Technical University of Dresden, Dresden,
Germany. For $n \geq 2$, let

\[ a_n = \frac{(n^2 + 1)(n^2 + 2) \cdots (n^2 + n)}{(n^2 - 1)(n^2 - 2) \cdots (n^2 - n)}. \]

Then $\lim_{n \to \infty} a_n = e$, by exercise 55 in G. Pólya and G. Szegő, Problems and Theorems
in Analysis, Springer-Verlag, 1972. Show that $\lim_{n \to \infty} n(a_n - e) = e$.


Solution by William F. Trench, Trinity University, San Antonio, TX. By Taylor’s Theorem 
applied to f(x) = log((1 + x)/(1 — x)), 


2 
f@) — 241 < Gap forO0 <x < 1. 


Since $\log a_n = \sum_{j=1}^{n} f(j/n^2)$, we have


rb) - als 


1 
n 


1 
no(1 — “Ta 2 yr o(33) 
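as $n \to \infty$. Therefore

\[ a_n = e \exp\left( \frac{1}{n} + O\left( \frac{1}{n^2} \right) \right) = e\left( 1 + \frac{1}{n} + O\left( \frac{1}{n^2} \right) \right), \]

which implies that $n(a_n - e) = e + O(1/n)$.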
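
The limit is easy to watch numerically. The sketch below evaluates $a_n$ directly from its definition in floating point; the function name `a` and the sample values of $n$ are choices made here for illustration.

```python
import math

def a(n):
    """a_n = prod_{j=1}^{n} (n^2 + j) / (n^2 - j), computed in floating point."""
    value = 1.0
    for j in range(1, n + 1):
        value *= (n * n + j) / (n * n - j)
    return value

# n * (a_n - e) for a few n; the values should drift toward e = 2.71828...
scaled_gaps = {n: n * (a(n) - math.e) for n in (10, 100, 1000)}
```

The gap between `scaled_gaps[n]` and $e$ shrinks roughly like $1/n$, consistent with the $O(1/n)$ error term in the solution.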


Editorial comment. Several solvers obtained additional terms in the asymptotic expansion 
of log an and thus of a,. Douglas B. Tyler computed the former completely in terms of the 
Bernoulli numbers Bo, B;, Bo, ... as follows: 


Le. We A Boj-2 bane (. =) 
] =1+-]1 = ——_—_————_ [| 5: . |, +O 
Beer +e ea 2 Geogr 2i — 2; aN 
= j=! 
In a different direction, William A. Newcomb proved that the original conclusion holds for

\[ \prod_{k=1}^{n} \frac{n + f(k/n)}{n - f(k/n)}, \]

where $f$ is an arbitrary $C^2$ function on $[0, 1]$.


Solved also by Z. Ahmed & M.A. Prasad (India), J. Anglesio (France), G. L. Body (U. K.), P. Bracken (Canada), R. J. Chapman 
(U. K.), R. Cuculiere (France), J. Deutsch, K. P. Hart (The Netherlands), G. Keselman, J. H. Lindsey II, V. Lucic (Canada), 
W. A. Newcomb, M. Omarjee (France), K. Schilling, H.-J. Seiffert (Germany), P. Simeonov, N. C. Singer, I. Sofair, A. Stadler 
(Switzerland), A. Stenger, D. B. Tyler, J. H. van Lint (The Netherlands), J. Wimp, GCHQ Problems Group (U. K.), and the proposer. 


Harmonic Products of Harmonic Functions 


10651 [1998, 271]. Proposed by W. K. Hayman, Imperial College, London, U. K. If $u_1$ and
$u_2$ are nonconstant real functions of two variables, and if $u_1$, $u_2$, and $u_1u_2$ are all harmonic
in a simply connected plane domain $D$, prove that $u_2 = av_1 + b$, where $v_1$ is a harmonic
conjugate of $u_1$ in $D$, and $a$ and $b$ are real constants.


Solution by Tewodros Amdeberhan, DeVry Institute, North Brunswick, NJ. In $\mathbb{R}^2$, we write
$w_x$ and $w_y$ for $\partial w/\partial x$ and $\partial w/\partial y$. Let $f = u_1 + iv_1$. Since $f$ is analytic, $f^2$ is analytic,
and hence $2u_1v_1 = \operatorname{Im}(f^2)$ is harmonic. Since

\[ \Delta(u_1u_2) = u_2\Delta u_1 + u_1\Delta u_2 + 2\nabla u_1 \cdot \nabla u_2 \quad\text{and}\quad \Delta(u_1v_1) = v_1\Delta u_1 + u_1\Delta v_1 + 2\nabla u_1 \cdot \nabla v_1, \]

it follows from the hypotheses that both vectors $\nabla u_2$ and $\nabla v_1$ are orthogonal to $\nabla u_1$ in $\mathbb{R}^2$.
Thus

\[ \nabla u_2 = a\nabla v_1, \tag{1} \]

for some real function $a = a(x, y)$. Consequently, $\Delta u_2 = a\Delta v_1 + \nabla a \cdot \nabla v_1$, and so

\[ \nabla v_1 \cdot (a_x, a_y) = 0. \tag{2} \]

Rewriting (1) in terms of components yields $(u_2)_x = a(v_1)_x$ and $(u_2)_y = a(v_1)_y$. Differentiating
with respect to $y$ and $x$, respectively, we get

\[ (u_2)_{xy} = a_y(v_1)_x + a(v_1)_{xy} \quad\text{and}\quad (u_2)_{yx} = a_x(v_1)_y + a(v_1)_{yx}. \]

This shows that

\[ \nabla v_1 \cdot (a_y, -a_x) = 0. \tag{3} \]

Combining (2) and (3) gives $\nabla a = 0$, so $a$ is a constant function. This in turn implies that
$\nabla(u_2 - av_1) = \nabla u_2 - a\nabla v_1 = 0$, proving that $u_2 - av_1$ is a constant.
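
The passage from (2) and (3) to $\nabla a = 0$ can be spelled out as follows. This is a sketch that works at a point where $\nabla v_1 \neq 0$; such points are dense because $v_1$ is a nonconstant harmonic function, and the conclusion then extends by continuity.

```latex
\[
\nabla v_1 \cdot (a_x, a_y) = 0
\quad\text{and}\quad
\nabla v_1 \cdot (a_y, -a_x) = 0 .
\]
The vectors $(a_x, a_y)$ and $(a_y, -a_x)$ are orthogonal to each other and have the
same length $|\nabla a|$. If $\nabla a \neq 0$ at some point, they would span $\mathbb{R}^2$ there,
and a vector orthogonal to both must vanish, forcing $\nabla v_1 = 0$ at that point.
Hence $\nabla a = 0$ wherever $\nabla v_1 \neq 0$.
```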


Editorial comment. Irl C. Bivens notes that the “+ b” may be eliminated in the statement 
of the problem if we are allowed to choose which harmonic conjugate v, of u; is to be 
used. He also notes that “simply connected” is not needed in the statement, since the other 
conditions of the problem imply the existence of a harmonic conjugate. 


Solved also by K. F. Andersen (Canada), J. Anglesio (France), I. C. Bivens, R. J. Chapman (U. K.), R. Govindaraj (India), M.
Gruber, R. Mortini (France), I. Netuka (Czech Republic), D. E. Tepper & J. Huntley, W. F. Trench, E. I. Verriest, and the proposer. 


Large Values of Tangent 


10656 [1998, 366]. Proposed by David P. Bellamy and Felix Lazebnik, University of 
Delaware, Newark, DE, and Jeffrey Lagarias, AT&T Laboratories, Florham Park, NJ. 
(a) Show that there are infinitely many positive integers n such that | tann| > n. 
(b) Show that there are infinitely many positive integers n such that tann > n/4. 


Solution by Stephen M. Gagola, Jr., Kent State University, Kent, OH. We use the notation
$\alpha = [a_0; a_1, a_2, \ldots]$ to represent the continued fraction expansion of the irrational number

\[ \alpha = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}}. \]

The convergents $h_i/k_i = [a_0; a_1, a_2, \ldots, a_i]$ have the property that the sequences $\{h_i\}$ and
$\{k_i\}$ satisfy the same recurrence relation $x_i = a_i x_{i-1} + x_{i-2}$ (but with different initial
conditions). From the theory of continued fractions (see for example Kumanduri and Romero,
Number Theory with Computer Applications, Prentice Hall, 1998; especially Chapters 11
and 13), we have $h_{i-1}k_i - h_i k_{i-1} = (-1)^i$, $h_{2i}/k_{2i}$ increases to $\alpha$, and $h_{2i+1}/k_{2i+1}$
decreases to $\alpha$. In particular, the interval whose endpoints are $h_i/k_i$ and $h_{i+1}/k_{i+1}$ contains
$\alpha$ and has length $1/(k_i k_{i+1})$, so

\[ \left| \alpha - \frac{h_i}{k_i} \right| < \frac{1}{k_i k_{i+1}} \leq \frac{1}{k_i(k_i + 1)}. \]

This can be improved (Proposition 13.1.9 of Kumanduri and Romero): For any $i$, at least
one of the convergents $h_i/k_i$ and $h_{i+1}/k_{i+1}$ satisfies $|\alpha - h/k| < 1/(2k^2)$.

When $\alpha = \pi/2$, we have $\pi/2 = [a_0; a_1, a_2, \ldots] = [1; 1, 1, 3, 31, 1, 145, \ldots]$, whose
first few convergents are

\[ \frac{h_0}{k_0} = \frac{1}{1}, \quad \frac{h_1}{k_1} = \frac{2}{1}, \quad \frac{h_2}{k_2} = \frac{3}{2}, \quad \frac{h_3}{k_3} = \frac{11}{7}, \quad \frac{h_4}{k_4} = \frac{344}{219}, \quad \frac{h_5}{k_5} = \frac{355}{226}. \]
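
The convergents listed above follow mechanically from the recurrence $x_i = a_i x_{i-1} + x_{i-2}$. The sketch below starts from the partial quotients quoted in the solution and uses the conventional seed values $h_{-1} = 1$, $h_{-2} = 0$, $k_{-1} = 0$, $k_{-2} = 1$, which are an assumption made here; the names are illustrative.

```python
import math

def convergents(quotients):
    """Convergents h_i/k_i from the recurrence x_i = a_i*x_{i-1} + x_{i-2}."""
    h_prev, h_prev2 = 1, 0   # h_{-1}, h_{-2}
    k_prev, k_prev2 = 0, 1   # k_{-1}, k_{-2}
    result = []
    for a in quotients:
        h = a * h_prev + h_prev2
        k = a * k_prev + k_prev2
        result.append((h, k))
        h_prev2, h_prev = h_prev, h
        k_prev2, k_prev = k_prev, k
    return result

quotients = [1, 1, 1, 3, 31, 1, 145]   # partial quotients of pi/2 quoted in the text
pairs = convergents(quotients)

# Numerators h_i with i >= 1, k_i odd and a_{i+1} >= 2 (the hypotheses of Claim 1 below).
claim1_hits = [
    pairs[i][0]
    for i in range(1, len(pairs) - 1)
    if pairs[i][1] % 2 == 1 and quotients[i + 1] >= 2
]
```

Among these first convergents only $h_3 = 11$ meets the hypotheses, and indeed $|\tan 11| \approx 226$ is far larger than 11.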
Claim 1. If $i \geq 1$, $k_i$ is odd, and $a_{i+1} \geq 2$, then $|\tan h_i| > h_i$. If in addition, $i$ is even,
then $\tan h_i > h_i$.


Proof of Claim 1. Write $h/k$ for $h_i/k_i$. We have $|\pi/2 - h/k| < 1/(k_i k_{i+1})$, and $\pi/2 - h/k$
is positive when $i$ is even. Therefore $|k\pi/2 - h| < 1/k_{i+1}$, so

\[ \begin{aligned} |\tan h| &= |\tan(k\pi/2 - (k\pi/2 - h))| = |\cot(k\pi/2 - h)| > \cot(1/k_{i+1}) \\ &> k_{i+1} - 1/(2k_{i+1}) = a_{i+1}k_i + k_{i-1} - 1/(2k_{i+1}) > 2k = (2/(h/k))h \geq h, \end{aligned} \]

where we have used the estimate $\cot\theta > (1/\theta) - (\theta/2)$, which is valid in the first quadrant.
When $i$ is even, the absolute value sign may be removed.


Claim 2. If $i \geq 3$ and both $k_i$ and $k_{i+1}$ are odd, then $|\tan h| > h$ holds for at least one of
the two convergents $h/k \in \{h_i/k_i, h_{i+1}/k_{i+1}\}$.


Proof of Claim 2. At least one of the convergents $h/k \in \{h_i/k_i, h_{i+1}/k_{i+1}\}$ satisfies
$|\pi/2 - h/k| < 1/(2k^2)$, and hence $|k\pi/2 - h| < 1/(2k)$. Estimating $|\tan h|$ as in the proof
of Claim 1, $|\tan h| = |\tan(k\pi/2 - (k\pi/2 - h))| = |\cot(k\pi/2 - h)| > \cot(1/(2k)) >
2k - 1/(2 \cdot 2k) > 2k - 1 = (2/(h/k) - 1/h)h \geq (2/(11/7) - 1/11)h = (13/11)h > h$,
which is valid for $i \geq 3$.

(a) Let $S = \{i \mid k_i \text{ and } k_{i+1} \text{ are odd}\}$ and $T = \{i \mid k_i \text{ is odd and } a_{i+1} \geq 2\}$. The result
follows from Claims 1 and 2 if we can show that $S \cup T$ is an infinite set. In fact we prove
that $S \cup T$ meets every set of four consecutive positive integers.

Fix a positive integer $i$. At least one of $k_i$, $k_{i+1}$ must be odd, and we replace $i$ by $i + 1$,
if necessary, so that $k_i$ is odd. If $k_{i+1}$ is odd, then $i \in S$, we are finished. Otherwise, $k_{i+1}$
is even, and then $k_{i+2} = a_{i+2}k_{i+1} + k_i$ is odd. If $i + 2 \in T$, we are finished, so assume
$i + 2 \notin T$. This implies $a_{i+3} = 1$, and then $k_{i+3} = a_{i+3}k_{i+2} + k_{i+1} = k_{i+2} + k_{i+1}$ is odd.
This last fact implies $i + 2 \in S$.
(b) In view of Claim 1, we may assume that $k_{2i}$ is odd for only finitely many integers $i$.
Then $k_{2i}$ is even and $k_{2i+1}$ is odd for all sufficiently large $i$. Now $k_{2i+2} = a_{2i+2}k_{2i+1} + k_{2i}$,
and so $a_{2i+2}$ is even (and hence $a_{2i+2} \geq 2$) for all sufficiently large $i$. For fixed large $i$, set
$h = h_{2i+1} + h_{2i}$ and $k = k_{2i+1} + k_{2i}$. Then $h_{2i}/k_{2i} < h/k < h_{2i+2}/k_{2i+2} < \pi/2 <
h_{2i+1}/k_{2i+1}$. Since $k$ is odd, $0 < \pi/2 - h/k < h_{2i+1}/k_{2i+1} - h/k = 1/(kk_{2i+1})$. Since
$k_{2i+1} > k/2$, we have $0 < \pi/2 - h/k < 2/k^2$, so $0 < k\pi/2 - h < 2/k$. Therefore $\tan h =
\tan(k\pi/2 - (k\pi/2 - h)) = \cot(k\pi/2 - h) > \cot(2/k) > k/2 - 1/k > (k - 1)/2 =
((1/2)(k/h) - 1/(2h))h$. Since $k/h$ is close to $2/\pi$ for large $k$, we have $(1/2)(k/h) -
1/(2h) \approx 1/\pi > 1/4$.
Editorial comment. Recent related problems from this MONTHLY include 10242 [1992, 675;
1997, 271] and 10640 [1998, 62]. The proposers remark: “Presumably for each $a > 0$
there exist infinitely many positive $n$ such that $\tan n > an$. This would be true if $\pi/2$ were
a ‘random’ real number.”


Solved also by J. Anglesio (France), R. Barbara (Lebanon), D. Callan, A. Stadler (Switzerland), A. Stenger, T. Trimble, C. Y. 
Yildirim (Turkey), SJSU Problems Ring, and the proposer. 


The Ellipse in a Paper Cup 


10664 [1998, 464]. Proposed by Vasile A. Mihai, Toronto, Canada. A paper cup in the
shape of a right circular cone contains some water. Show that if one tips the cup at an angle
$\theta$ without spilling the liquid, then the surface of the water describes an ellipse whose minor
axis has length independent of $\theta$.


Solution by J. Schaer, University of Calgary, Calgary, Canada. Let the cone be given by
$z^2 = c(x^2 + y^2)$ and the initial water level by $z = h$. In this position, the surface is a circle
of radius $b = h/\sqrt{c}$, and the volume is $V = \frac{\pi}{3}b^2h = \frac{\pi}{3}bA$, where $A$ is the area of the
“wet” triangle in the $yz$-plane. When the cone is tipped, the water surface is an ellipse with
minor semiaxis $b'$ and volume $V'$. We wish to show that if $V' = V$, then $b' = b$. In this
case the converse is equivalent: It suffices to show that if $b' = b$, then $V' = V$. Rather than
tipping the cone, we may consider cutting it by planes that are parallel to the $x$-axis and
produce an ellipse with minor semiaxis $b$. Since this minor axis is parallel to the $x$-axis, the
endpoints of the minor axis lie in the planes $x = \pm b$, and their projections into the $yz$-plane
form a hyperbola $H$ with equation $z^2 = c(b^2 + y^2)$. The asymptotes of $H$ are the lines of
intersection of the cone with the $yz$-plane. The major axis of the boundary ellipse lies in
the $yz$-plane, its endpoints lie on the asymptotes of $H$, and its midpoint lies on $H$.


Proposition. A segment that touches a given hyperbola at its midpoint and ends on the
asymptotes of the hyperbola is tangent to the hyperbola, and the triangles formed by the
asymptotes and such segments all have the same area.


Proof. The described property of hyperbolas is invariant under affine transformations, and 
all hyperbolas are affinely equivalent to the hyperbola with equation y = 1/x. So it suffices 
to show the property for y = 1/x. This is a simple calculation. □
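
One way to carry out the calculation for y = 1/x is sketched here. The intercepts p and q are labels introduced only for this illustration, and only the branch in the first quadrant is treated; the other branch follows by symmetry.

```latex
The asymptotes of $y = 1/x$ are the coordinate axes. Let a segment join
$(p, 0)$ and $(0, q)$ with $p, q > 0$, and suppose that its midpoint
$(p/2,\, q/2)$ lies on the hyperbola:
\[
  \frac{q}{2} = \frac{1}{p/2} \quad\Longleftrightarrow\quad pq = 4 .
\]
The slope of the segment is $-q/p = -4/p^{2}$. The slope of the hyperbola
at $x = p/2$ is
\[
  \left.\frac{d}{dx}\,\frac{1}{x}\right|_{x = p/2} = -\frac{1}{(p/2)^{2}} = -\frac{4}{p^{2}} ,
\]
so the segment is tangent to the hyperbola at its midpoint. The triangle cut
off by the segment and the two asymptotes has area
\[
  \tfrac{1}{2}\, p q = 2 ,
\]
which does not depend on the choice of $p$.
```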

Let h’ be the height of the tipped cone whose base is the ellipse and whose vertex 
is 0, and let a be the major semiaxis. The Proposition implies that the area A’ of the 
“wet” triangle is ah’ = A’ = A = bh. The volume of the tipped cone is therefore 
$V' = \frac{\pi}{3} b a h' = \frac{\pi}{3} b A' = \frac{\pi}{3} b A = V$.
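
The conclusion can also be checked numerically. The sketch below is not part of the published solution: it assumes the cone is $z^2 = c(x^2 + y^2)$ with $z \ge 0$ and models the tipped water surface as the plane $z = my + k$, where the slope m, the offset k, and the function names are all introduced here for illustration. It computes the semiaxes of the section directly and then rescales k so that the enclosed volume matches a prescribed value.

```python
import math


def tipped_section(c: float, m: float, k: float):
    """Section of the cone z^2 = c(x^2 + y^2), z >= 0, by the plane z = m*y + k.

    Returns (minor semiaxis, major semiaxis, volume between vertex and plane).
    Requires |m| < sqrt(c), so that the section is an ellipse, and k > 0.
    """
    d = c - m * m
    if d <= 0 or k <= 0:
        raise ValueError("need |m| < sqrt(c) and k > 0 for an elliptical section")
    minor = k / math.sqrt(d)                        # semiaxis parallel to the x-axis
    major = k * math.sqrt(c * (1 + m * m)) / d      # semiaxis lying in the yz-plane
    height = k / math.sqrt(1 + m * m)               # distance from the vertex to the plane
    volume = math.pi * major * minor * height / 3   # cone over an elliptical base
    return minor, major, volume


def minor_semiaxis_for_volume(c: float, m: float, volume: float) -> float:
    """Minor semiaxis of the surface with slope m that encloses the given volume."""
    # Every length scales linearly with k, so the volume scales like k**3.
    _, _, unit_volume = tipped_section(c, m, 1.0)
    k = (volume / unit_volume) ** (1.0 / 3.0)
    return tipped_section(c, m, k)[0]


if __name__ == "__main__":
    c, h = 3.0, 2.0
    b = h / math.sqrt(c)                 # radius of the level surface
    V = math.pi * b * b * h / 3          # volume of water in the upright cup
    for m in (0.0, 0.5, 1.0, 1.5):
        print(m, minor_semiaxis_for_volume(c, m, V), b)
```

Under these assumptions the volume works out to $\frac{\pi}{3}\sqrt{c}\,b'^3$, which involves the slope m only through b’, so fixing the volume fixes the minor semiaxis.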


Editorial comment. This problem appeared earlier in this MONTHLY: In volume 19 (1912), 
it was proposed and solved by C. N. Schmall. For a related property of cones (which can 
be used to solve this problem) the reader is referred to R. J. Bagby, Volumes of Cones, this 
MONTHLY 103 (1996) 794-796. 


Solved also by J. Anglesio (France), A. B. Ayoub, R. J. Bagby, M. Barra and C. Bernardi (Italy), M. Benedicty, G. D. Chakerian, 
R. J. Chapman (U. K.), J. Dou (Spain), J.-P. Grivaux (France), G. L. Isaacs, P. M. Jarvis and G. Atkins, W. Kim (South Korea), N. 
Lakshmanan, W. C. Lang, J. H. Lindsey II, J. Marengo, S. Metcalf, M. D. Meyerson, H. S. Morse, D. K. Nester, R. Patenaude, 
C. Popescu (Belgium), C. R. Pranesachar (India), C. Rosenkilde, A. Sasane (The Netherlands), L. Scribani (South Africa), P. 
Simeonov, W. R. Smythe, P. Szeptycki, L. Verriest, R. Voles (U. K.), Anchorage Math Solutions Group, Con Amore Problems 
Group (The Netherlands), GCHQ Problems Group (U. K.), and the proposer. 
REVIEWS 


Edited by Harold P. Boas 
Mathematics Department, Texas A & M University, College Station, TX 77843-3368 


The Four-Color Theorem. By Rudolf Fritsch and Gerda Fritsch, translated from the 
German by J. Peschke. Springer-Verlag, 1998, xvi + 260 pp., $29.95. 


Reviewed by John A. Koch 


This book “has been written to explain the Four Color Theorem to a lay 
readership,” and, for the most part, it succeeds. The highest praise I can give such 
an effort is that I learned from it both bits of history and developments that have 
occurred since I was involved [2] in the solution of the problem in 1976. The book 
begins with a review of the historical foundations of the theorem and ends with a 
reference to a website [5] that displays the recent work of Robertson, Sanders, 
Seymour, and Thomas. 

The Four Color Theorem has generated interest among mathematicians and 
non-mathematicians alike: “The regions of every planar map can be colored using 
no more than four colors such that those regions that are adjacent have different 
colors.” Most amateur investigators immediately conjure up regions shaped like 
the spokes in a wheel. The requirement that adjacent regions touch at more than a 
single point is necessary for a meaningful theorem. 

The historical section begins with the origin of the theorem in an observation of 
Francis Guthrie, whose younger brother Frederick submitted the problem to his 
professor Augustus de Morgan in 1852. Alfred Kempe appeared to have solved the 
problem in 1879 when he published his paper in the American Journal of Mathe- 
matics Pure and Applied. It is an interesting sidelight how Kempe, a lawyer and an 
Englishman, came to submit to this American publication, at the time a “compara- 
tively insignificant” journal. In 1890, Percy Heawood identified an error in Kempe’s 
proof. However, Kempe’s arguments do yield a relatively simple proof of the Five 
Color Theorem. 

The Fritsches particularly highlight the German connection to the theorem. The 
important efforts of Heinrich Heesch and Karl Dürre led to Wolfgang Haken’s 
involvement, and the interplay between these three and Ken Appel resulted in the 
unavoidable sets being winnowed down from one million elements to fewer than 
2000. Most of the researchers in the Four Color field were aware of what the 
others were doing; I recall Appel relating that he and Haken stopped work on 
their approach in 1970 to investigate Shimamoto’s supposed proof. 

To prove the Four Color Theorem, one first translates it into an equivalent 
problem about graphs. The proof then breaks down into two major components: 
first the generation of an unavoidable set of configurations, and then the demon- 
stration that no element of the unavoidable set can be in a minimal counter- 
example to the theorem. 

One unavoidable configuration can easily be derived from Euler’s formula 
relating the number of faces f, vertices v, and edges e of a graph: 


$v - e + f = 2.$ (1) 
Since each edge borders two faces, and each face is surrounded by at least three 
edges, it follows that $3f \le 2e$, so that 


$f \le 2e/3.$ (2) 

This inequality together with (1) yields $v - e + 2e/3 \ge 2$, which implies 
$3v - e \ge 6$, or 

$e \le 3v - 6.$ (3) 


Consequently, there must be a vertex with degree less than or equal to 5 in a 
connected, planar graph with no self-loops. Indeed, suppose that all the vertices of 
a graph had degree greater than 5. Adding the degrees of the vertices would show 
that $2e \ge 6v$, or $e \ge 3v$, which would contradict (3). Thus, there must be a vertex 
of degree 1, 2, 3, 4, or 5 in any planar graph with no self-loops: this is the 
unavoidable set that Kempe used in his failed proof of the Four Color Theorem. 
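
As a small sanity check of the counting argument, the sketch below tests Euler’s formula and the two derived inequalities on the five Platonic solids, regarded as planar graphs. The dictionary and function names are chosen here for illustration only.

```python
# (vertices, edges, faces) of the five Platonic solids, viewed as planar graphs
PLATONIC = {
    "tetrahedron": (4, 6, 4),
    "cube": (8, 12, 6),
    "octahedron": (6, 12, 8),
    "dodecahedron": (20, 30, 12),
    "icosahedron": (12, 30, 20),
}


def planar_checks(v: int, e: int, f: int) -> dict:
    """Evaluate the relations used in the argument for one graph."""
    return {
        "euler": v - e + f == 2,        # Euler's formula
        "faces": 3 * f <= 2 * e,        # each face has at least three edges
        "edges": e <= 3 * v - 6,        # consequence of the two lines above
        "average_degree": 2 * e / v,    # must stay below 6
    }


if __name__ == "__main__":
    for name, (v, e, f) in PLATONIC.items():
        print(name, planar_checks(v, e, f))
```

The icosahedron is the extreme case: it satisfies the edge bound with equality and every one of its vertices has degree 5, so the bound 5 in the conclusion cannot be lowered.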

If there is a counterexample to the Four Color Theorem, then there is one with 
a minimal number of vertices, obviously at least five. The second part of the proof 
is to show that every element (called a configuration) of the unavoidable set is 
reducible, that is, cannot be in a minimal counterexample to the theorem. 

Kempe attempted to show that a degree 5 vertex is reducible by using a process 
that became known as “Kempe chaining.” The flaw Heawood noted was that 
Kempe changed the colors of two chains simultaneously. 

The process of showing that a configuration f is reducible begins with assuming 
that f is embedded in a minimal counterexample to the Four Color Theorem. One 
removes f, yielding a smaller graph. Since the original graph was assumed to be a 
minimal counterexample, the smaller graph can be colored with four colors. Now 
replace f in the graph and try to extend the existing coloration of the ring 
surrounding f into the interior vertices. If this can be done for an arbitrary 
coloration of the ring, then f is called A-reducible. 

Other types of reducible configurations allow one to examine fewer ring 
colorations: B-reductions involve merging ring vertices (thus causing their colors to 
be the same) or adding edges between ring vertices (thus causing their colors to be 
different), while C and D reductions involve replacing the original configuration 
with a configuration containing fewer vertices (so that the whole graph can be four 
colored) and examining the resulting possible ring colorations. Such reducers 
decrease the total possible number of ring colorations that must be examined. This 
becomes critical when one considers the combinatorial explosion in possible 
unique ring colorations: 


ring size colorations 
10 2,461 
11 7,381 
12 22,144 
13 66,430 
14 199,291 
15 597,871 
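
The growth in this table can be reproduced with a short computation. The sketch below assumes that a "ring coloration" means a proper 4-coloring of a cycle of n vertices, with two colorings identified when they differ only by renaming the colors; that reading is an interpretation adopted here, not something stated in the book or the review. It uses the standard count $(k-1)^n + (-1)^n (k-1)$ of proper k-colorings of an n-cycle, followed by inclusion-exclusion over the number of colors actually used.

```python
def cycle_colorings(n: int, k: int) -> int:
    """Number of proper colorings of an n-cycle when k colors are available."""
    return (k - 1) ** n + (-1) ** n * (k - 1)


def ring_colorations(n: int) -> int:
    """Proper 4-colorings of an n-cycle (n >= 3), counted up to renaming of colors."""
    c2, c3, c4 = (cycle_colorings(n, k) for k in (2, 3, 4))
    # Colorings that use exactly 2, 3, or 4 colors (inclusion-exclusion).
    exactly2 = c2
    exactly3 = c3 - 3 * c2
    exactly4 = c4 - 4 * c3 + 6 * c2
    # A coloring using exactly j colors has j! distinct renamings.
    return exactly2 // 2 + exactly3 // 6 + exactly4 // 24


if __name__ == "__main__":
    for n in range(10, 16):
        print(n, ring_colorations(n))
```

Under this interpretation the function gives 2,461; 7,381; 22,144; 66,430; 199,291; and 597,871 for ring sizes 10 through 15, so the count roughly triples with each additional ring vertex.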


After their historical discussion, the Fritsches begin with topological maps in 
Chapter 2. At the start of a section that proves lemmas concerning simple curves 
and the Jordan curve theorem, they state: “It must, however, be emphasized that 
many seemingly self-evident statements and theorems are sometimes difficult to 
prove rigorously.” Chapter 3 provides the topological version of the Four Color 
Theorem. The terms regular map, vertex degree, circuit, and border vertex are 
defined, and lemmas are proved about the amusingly named “minimal criminal,” 
which is a postulated minimal counterexample to the Four Color Theorem. 
The authors take the combinatorial approach in Chapter 4, where they prove 
the duality of maps and graphs. The usual transformation is to consider a point as 
the capital of a region. These capital points (vertices) are joined by lines to capitals 
in adjacent countries. Thus, the problem becomes to color the vertices of a planar 
graph in such a way that adjacent vertices have different colors. This is the 
formulation of the problem that Appel, Haken, and I worked with most closely. 
The authors prove the Five Color Theorem in this chapter. 
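
To make the vertex-coloring formulation concrete, here is a minimal backtracking sketch. It is an illustration only and is unrelated to the reducibility programs discussed in the book; the helper names `color_graph` and `wheel` are invented for this example. The test graph is a wheel: one hub joined to every vertex of a rim cycle, which is the dual picture of a map with one central region surrounded by a ring of regions.

```python
from typing import Dict, List, Optional


def color_graph(adj: Dict[int, List[int]], k: int) -> Optional[Dict[int, int]]:
    """Return a proper coloring with colors 0..k-1, or None if none exists."""
    order = sorted(adj, key=lambda v: -len(adj[v]))   # high-degree vertices first
    colors: Dict[int, int] = {}

    def backtrack(i: int) -> bool:
        if i == len(order):
            return True
        v = order[i]
        used = {colors[u] for u in adj[v] if u in colors}
        for c in range(k):
            if c not in used:
                colors[v] = c
                if backtrack(i + 1):
                    return True
                del colors[v]                         # undo and try the next color
        return False

    return dict(colors) if backtrack(0) else None


def wheel(rim: int) -> Dict[int, List[int]]:
    """Wheel graph: hub 0 joined to rim vertices 1..rim, which form a cycle."""
    adj = {0: list(range(1, rim + 1))}
    for i in range(1, rim + 1):
        nxt = i % rim + 1
        prv = (i - 2) % rim + 1
        adj[i] = [0, prv, nxt]
    return adj


if __name__ == "__main__":
    print(color_graph(wheel(5), 3))   # odd rim: three colors are not enough
    print(color_graph(wheel(5), 4))   # four colors suffice
```

A wheel with an odd rim needs all four colors, since the odd cycle alone requires three and the hub is adjacent to every rim vertex. Exhaustive search like this is practical only for very small graphs.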

Chapter 5 discusses the combinatorics of the graphical version of the theorem. 
At the end of the chapter, the authors give the necessary definitions of reducible 
configuration and unavoidable set. Up to this point, they have proved most of the 
lemmas. In the remaining 70 pages, they describe the four types of reductions 
(A, B, C, and D) with detailed examples. They even list Dürre’s program written 
in Algol, with German comments. 

The final 11 pages discuss general principles involved in the massive process of 
determining the unavoidable set. This process is described through obstructions in 
configurations, some “rules of thumb,” and “geographical goodness.” 

An interesting aspect of the proof is that there is not a single unique unavoid- 
able set. In fact, as the original proof developed, certain configurations that were 
found too difficult to reduce were replaced by others. The unavoidable set in the 
original paper consists of 1476 configurations. The proof of Robertson, Sanders, 
Seymour, and Thomas [4] uses 633 configurations, and it trims the number of 
discharging rules from more than 300 to only 32. Despite the improvement in the 
proof, it has not been reduced to a simple enough process to satisfy all mathemati- 
cians (or even all non-mathematicians). The proof still involves enough computer 
calculation that one cannot verify the result by hand. 

The possibility of an error in the computer programs troubles some people. 
However, there are several parameters that come out of the reduction process, and 
others who have written programs to reduce configurations have achieved the 
same parameters for the same configurations. The situation is analogous to solving 
a riddle: once you know the answer, it seems trivial; but to find the solution may 
involve many exhaustive trials. 

This book would be excellent for college students involved in topics courses or 
senior projects. The beginning basics are described in detail. Although there is a 
definite lack of information about the discharging procedures used to develop the 
unavoidable set, there is a useful bibliography and enough leads to keep good 
mathematics students busy. 
Reference, P. Encyclopedia of Statistical Sci- 
ences, Update Volume 3. Ed: Samuel Kotz, et 
al. Wiley, 1999, xvii + 898 pp, $215. [ISBN 
0-471-23883-X] 


Education, P, L. Making Change in Mathemat- 
ics Education: Learning from the Field. Eds: 
Joan Ferrini-Mundy, et al. NCTM, 1998, xii 
+ 148 pp, $10.95 (P). [ISBN 0-87353-442- 
5] Profiles of pioneers who embarked early 
on implementation of the 1989 NCTM Stan- 
dards based on a qualitative research study 
undertaken by NCTM at seventeen different 
sites. Some conclusions: change is challenging; 
teachers need space to experiment; no single 
approach fits all circumstances. Unfortunately, 
many teachers miss the mathematical point of 
the Standards, so some results are more carica- 
tures than implementation. LAS 


Education, P*, L*. Developing Mathemati- 
cally Promising Students. Ed: Linda Jensen 
Sheffield. NCTM, 1999, xii + 316 pp, 
$37.50 (P). [ISBN 0-87353-470-0] A rich 
and varied resource for teachers struggling to 
challenge the ablest students even as they pro- 
vide a common core curriculum for all students. 
34 papers on identification of promising stu- 
dents (are others thus “unpromising?”), on de- 
veloping cultures of mathematics and oppor- 
tunity, and on examples of programs and ap- 
proaches that work. LAS 


Education, P. Challenges in the Mathematics 
Education of African American Children. Eds: 
Carol E. Malloy, Laura Brader-Araje. NCTM, 
1998, ix + 85 pp, $9.95 (P). [ISBN 0-87353- 
458-1] Proceedings of the 1997 Benjamin 
Banneker Association Leadership Conference. 


Education, P. Elementary School Mathemat- 
ics: What Parents Should Know About Problem 
Solving/Estimation, Second Edition. Barbara J. 
Reys. NCTM, 1999, $6 (P). [ISBN 0-87353- 
467-0] 


Education, P. Changing the Faces of Math- 
ematics: Perspectives on Latinos. Eds: Walter 
G. Secada, et al. NCTM, 1999, viii + 168 pp, 
$16 (P). [ISBN 0-87353-464-6] First in a 
planned series of six volumes focused on issues 
related to the teaching and learning of math- 
ematics among ethnic minorities and women. 
Five sections: Socioeducational Issues; Lan- 
guage Issues; Teaching-Learning Aids; Staff 
Development; Intervention Programs. 


Logic, T(15-17: 1), L. A Set Theory Workbook. 
Iain T. Adamson. Birkhauser Boston, 1998, viii 
+ 154 pp, $29.50 (P). [ISBN 0-8176-4028-2] 
Text for an introductory set theory course us- 
ing the Moore method. Presents definitions and 
notation; 155 exercises cover examples and the- 
orems. Includes hints and complete solutions 
for exercises. Follows Von Neumann–Bernays– 
Gödel approach. KES 


Logic, T*(17: 1, 2), L. Logic of Mathe- 
matics: A Modern Course of Classical Logic. 
Zofia Adamowicz, Pawel Zbierski. Pure & 
Appl. Math. Wiley, 1997, viii + 260 pp, 
$59.95. [ISBN 0-471-06026-7] Concise and 
clear. Part I contains introduction to logic and 
model theory; emphasizes relational structures. 
Part II covers Gödel’s incompleteness theorems, 
Tarski’s theorem on real closed fields, Matiya- 
sevich’s theorem on diophantine relations. KES 


Logic, P. Provability, Complexity, Gram- 
mars. Lev Beklemishev, Mati Pentus, Niko- 
lai Vereshchagin. AMS Transl. Ser. 2, V. 192. 
AMS, 1999, ix + 172 pp. [ISBN 0-8218- 
1078-2] Three award-winning dissertations in 
mathematical logic, mathematical linguistics, 
and complexity theory. 


Foundations, T(13-14: 1), S, L. Elements 
of Logic via Numbers and Sets. D.L. John- 
son. Springer-Verlag, 1998, x + 174 pp, 
$29.95 (P). [ISBN 3-540-76123-3] Concise, 
sophisticated text for transition course. (For 
example, the definition of a field is given on 
page 4.) Covers types of proof, truth tables, 
quantifiers, sets, relations, functions, cardinal 
numbers. Few drill exercises; provides com- 
plete solutions for most exercises. KES 


Discrete Mathematics, T(17-18: 1), P. Topics 
in Intersection Graph Theory. Terry A. McKee, 
F.R. McMorris. SIAM, 1999, viii + 205 pp, 
$55 (P). [ISBN 0-89871-430-3] The inter- 
section graph of a family of sets is the graph 
in which the sets are the vertices, and two ver- 
tices are adjacent if the corresponding sets have 
nonempty intersection. Covers theory and tech- 
niques common to various types of intersection 
graphs. Emphasizes chordal, interval, and com- 
petition graphs. Includes guide to literature for 
related topics, exercises, extensive bibliogra- 
phy, applications. KES 


Algebra, P. The Classification of the Finite 
Simple Groups, Number 4, Part II, Chapters 1- 
4: Uniqueness Theorems. Daniel Gorenstein, 
Richard Lyons, Ronald Solomon. Math. Surv. 
& Mono., V. 40, No. 4. AMS, 1999, xv + 
341 pp, $75. [ISBN 0-8218-1379-X] 

Algebra, T(15-16: 1, 2), L. Algebra: Ab- 
stract and Concrete. Frederick M. Goodman. 
Prentice-Hall, 1998, xv + 335 pp. [ISBN 0- 
13-283988-1] Text for first or second abstract 
algebra course. Presents groups first. In- 
cludes group actions, field extensions, solvabil- 
ity, isometry groups. Emphasis on symmetry. 
Prerequisite: first course in linear algebra. KES 


Real Analysis, T(16-17: 1, 2). Principles of 
Real Analysis, Third Edition. Charalambos D. 
Aliprantis, Owen Burkinshaw. Academic Pr, 
1998, x + 415 pp. [ISBN 0-12-050257-7] Be- 
sides the basics, covers measurability, Lebesgue 
integral, Stone–Weierstrass theorem, normed 
spaces, L_p-spaces, Hilbert spaces, and Fourier 
analysis (new in this edition). (Second Edition, 
TR, December 1990.) SN 


Real Analysis, S(16-17). Problems in Real 
Analysis, Second Edition: A Workbook with 
Solutions. Charalambos D. Aliprantis, Owen 
Burkinshaw. Academic Pr, 1999, vii + 403 pp. 
[ISBN 0-12-050257-7] Detailed solutions to 
all 609 problems in the authors’ Principles of 
Real Analysis, Third Edition. (See preceding 
review.) SN 


Real Analysis, T(17: 2, 3), L. Real Anal- 
ysis: Modern Techniques and Their Applica- 
tions, Second Edition. Gerald B. Folland. Pure 
& Appl. Math. Wiley, 1999, xiv + 386 pp, 
$74.95. [ISBN 0-471-31716-0] Second Edi- 
tion of popular text. New features include: ex- 
panded sections on n-dimensional Lebesgue in- 
tegral, Fourier analysis and distributions; added 
material on self-similarity and Hausdorff di- 
mension; new proof of Tychonoff’s theorem. 
(First Edition, TR, May 1985.) KS 


Complex Analysis, T*(16: 1, 2), L. Complex 
Variables. M. Ya. Antimirov, A.A. Kolyshkin, 
Rémi Vaillancourt. Academic Pr, 1998, xii 
+ 476 pp. [ISBN 0-12-059545-1] A viable 
choice for a first course. More formal than 
some, it has a huge collection of problems. Not 
much on mappings, but includes an interesting 
application of residue theory to the evaluation 
of infinite sums. Worth a look. TAV 


Complex Analysis, P. Positivity in Com- 
plex Spaces and Plurisubharmonic Functions. 
Pierre Lelong. Papers in Pure & Appl. Math., 
V. 112. Queen’s Univ, 1998, x + 243 pp, 
(P). [ISBN 0-88911-828-0] Collects 10 of Le- 
long’s previously published papers. 


Partial Differential Equations, T(15: 1), L. 
Boundary Value Problems, Fourth Edition. 
David L. Powers. Academic Pr, 1999, xi 
+ 528 pp. [ISBN 0-12-563734-9] Major 
changes since the Third Edition (TR, June-July 
1987): new sections on applications of Legen- 
dre polynomials and the error function; almost 
100 new exercises. PG 


Numerical Analysis, P, C. Proceedings of the 
Ninth SIAM Conference on Parallel Processing 
for Scientific Computing 1999. SIAM Activ- 
ity Group on Supercomputing, 1999, CD-ROM. 
[ISBN 0-89871-435-4] 


Functional Analysis, P. Function Spaces. Ed: 
Krzysztof Jarosz. Contemp. Math., V. 232. 
AMS, 1999, xvii + 361 pp, $81 (P). [ISBN 
0-8218-0939-3] Proceedings of a conference 
held at Southern Illinois University in 1998. 


Analysis, P. Positive Solutions of Differen- 
tial, Difference and Integral Equations. Ravi P. 
Agarwal, Donal O’Regan, Patricia J.Y. Wong. 
Kluwer Academic, 1999, xi + 417 pp, $210. 
[ISBN 0-7923-5510-5] 
Analysis, P. Handbook of Splines. Gheorghe 
Micula, Sanda Micula. Math. & Its Applic., 
V. 462. Kluwer Academic, 1999, xvi + 604 pp. 
[ISBN 0-7923-5503-2] An up-to-date survey 
of the theory of spline functions and some of 
their applications. Includes an extensive bibli- 
ography (nearly 120 pages!). 

Differential Geometry, P. Moscow Seminar 
in Mathematical Physics. Eds: A. Yu. Moro- 
zov, M.A. Olshanetsky. AMS Transl. Ser. 2, 
V. 191. AMS, 1999, x + 299 pp, $110. 
[ISBN 0-8218-1388-9] 9 papers based on 
talks given at the Moscow Institute of Theoreti- 
cal and Experimental Physics. “The articles are 
mainly devoted to various aspects of Knizhnik— 
Zamolodchikov—Bernard connections and in- 
tegrable models in two-dimensional quantum 
field theory.” 


Differential Geometry, P. Symmetries and 
Conservation Laws for Differential Equa- 
tions of Mathematical Physics. Eds: I.S. 
Krasil’shchik, A.M. Vinogradov. Transl. of 
Math. Mono., V. 182. AMS, 1999, xiv + 
333 pp, $129. [ISBN 0-8218-0958-X] Rig- 
orous mathematics and concrete examples il- 
lustrate the geometric approach to the study of 
nonlinear PDEs. 


Differential Geometry, P. Differential and 
Symplectic Topology of Knots and Curves. Ed: 
S. Tabachnikov. AMS Transl., Ser. 2, V. 190. 
AMS, 1999, x + 286 pp, $99. [ISBN 0-8218- 
1354-4] 


Differential Geometry, P. The Topology of Fi- 
bre Bundles. Norman Steenrod. Landmarks in 
Math. & Physics. Princeton Univ Pr, 1999, viii 
+ 229 pp, $19.95 (P). [ISBN 0-691-00548-6] 
Paperback republication of the 1951 edition. 


Geometry, T(15-16), S, L. Conics and Cu- 
bics: A Concrete Introduction to Algebraic 
Curves. Robert Bix. Undergrad. Texts in Math. 
Springer-Verlag, 1998, x + 289 pp, $49.95. 
[ISBN 0-387-98401-1] A clever book on the 
algebraic geometry of curves of degree at most 
3. Could be used as the textbook for a geometry 
course for mathematics majors or for a sequel to 
the usual college geometry course for prospec- 
tive secondary teachers. PF 


Algebraic Topology, P. Conjugacy Classes 
in Gauge Groups. Renzo A. Piccinini, Mauro 
Spreafico. Papers in Pure & Appl. Math., 
V. 111. Queen’s Univ, 1998, 138 pp, (P). [ISBN 
0-88911-826-4] 


Algebraic Topology, S(16), L. Algebraic 
Topology: An Intuitive Approach. Hajime Sato. 
Transl: Kiki Hudson. Transl. of Math. Mono., 
V. 183. AMS, 1999, xviii + 118 pp, $20 (P). 
[ISBN 0-8218-1046-4] This translation of a 
1996 Japanese monograph provides a gentle in- 
troduction to algebraic topology. The author’s 
claim that no previous knowledge of mathe- 
matics is necessary is overstated, but the text 
is reader-friendly (e.g., it concentrates on sim- 
ple examples at the expense of generalizations). 
Very few exercises. A nice supplement for a 
topology course. JD 


Topology, P. Tel Aviv Topology Conference: 
Rothenberg Festschrift. Eds: Michael Farber, 
Wolfgang Lück, Shmuel Weinberger. Contemp. 
Math., V. 231. AMS, 1999, ix + 320 pp, $71 (P). 
[ISBN 0-8218-1362-5] Papers from a 1998 
conference at Tel Aviv University. 


Topology, P. Aspects of Ultrametric Spaces. 
Ulrich Heckmanns. Papers in Pure & Appl. 
Math., V. 109. Queen’s Univ, 1998, iv + 134 pp, 
(P). [ISBN 0-88911-822-1] 


Topology, T(18), S, P. Hyperspaces: Fun- 
damentals and Recent Advances. Alejandro 
Illanes, Sam B. Nadler, Jr. Pure & Appl. 
Math., V. 216. Dekker, 1999, xvii + 512 pp, 
$175. [ISBN 0-8247-1982-4] First few chap- 
ters present the basics of hyperspaces. Re- 
maining chapters detail developments in the 20 
years since the second author’s Hyperspaces 
of Sets appeared, especially Whitney proper- 
ties and Whitney-reversible properties. Many 
examples, exercises, references, and research 
problems. JD 


Topology, P. Surgery on Compact Manifolds, 
Second Edition. C.T.C. Wall. Ed: A.A. Ranicki. 
Math. Surv. & Mono., V. 69. AMS, 1999, xv + 
302 pp, $59. [ISBN 0-8218-0942-3] This edi- 
tion supplements the 30-year-old original with 
notes, footnotes, updated references, and some 
corrections. (1971 Academic Press First Edi- 
tion, TR, June-July 1971.) JD 


Optimization, P. A Reformulation-Linearization 
Technique for Solving Discrete and Continuous 
Nonconvex Problems. Hanif D. Sherali, Warren 
P. Adams. Nonconvex Optim. & Its Applic., 
V. 31. Kluwer Academic, 1999, xxiii + 514 pp, 
$252. [ISBN 0-7923-5487-7] 


Optimization, P. Nonlinear Programming and 
Variational Inequality Problems: A Unified Ap- 
proach. Michael Patriksson. Appl. Optim., 
V. 23. Kluwer Academic, 1999, xiv + 334 pp. 
[ISBN 0-7923-5455-9] 


Stochastic Processes, T, P, L. Stochastic Pro- 
cesses for Insurance and Finance. Tomasz Rol- 
ski, et al. Ser. in Prob. & Stat. Wiley, 1999, 
xviii + 654 pp, $175. [ISBN 0-471-95925-1] 
An interesting treatment of all the usual topics 
in a stochastic processes course, from Markov 
chains to continuous time martingales, in the 
context of actuarial and financial considerations 
of an insurance firm. Written as a text, the price 
is a hurdle. TAV 


Stochastic Processes, P*. Introduction to Ma- 
trix Analytic Methods in Stochastic Modeling. 
G. Latouche, V. Ramaswami. Ser. on Stat. & 
Appl. Prob. SIAM and American Statistical 
Assoc, 1999, xiv + 334 pp, $49.50 (P). [ISBN 
0-89871-425-7] The matrix analytic method 
is an important modeling technique because of 
its wide applicability and numerical tractability. 
A somewhat daunting treatment, but a valuable 
addition to the professional literature. Assumes 
reader has background in advanced calculus, 
linear algebra, and stochastic processes. TAV 


Stochastic Processes, T*(17: 2), P. Stochas- 
tic Dynamic Programming and the Control of 
Queueing Systems. Linn I. Sennott. Ser. in 
Prob. & Stat. Wiley, 1999, xiv + 328 pp, 
$79.95. [ISBN 0-471-16120-9] A powerful 
treatment of stochastic methods in dynamic pro- 
gramming. Each chapter has numerous chal- 
lenging problems and a detailed bibliography. 
Assumes a first course in stochastic processes, 
specifically Markov chains. TAV 


Stochastic Processes, P. Gaussian Measures. 
Vladimir I. Bogachev. Math. Surv. & Mono., 
V. 62. AMS, 1998, xii + 433 pp, $95. [ISBN 
0-8218-1054-5] From the Preface: “The mod- 
ern theory of Gaussian measures lies at the in- 
tersection of the theory of random processes, 
functional analysis, and mathematical physics, 
and is closely connected with diverse applica- 
tions in quantum field theory, statistical physics, 
financial mathematics, ... .” A rich intersec- 
tion, indeed. A deep and detailed treatment of 
an important subject. TAV 


Elementary Statistics, T(14: 1). Statistical 
Reasoning and Methods. Richard A. Johnson, 
Kam-Wah Tsui. Wiley, 1998, xiv + 589 pp, 
$86.95. [ISBN 0-471-04205-6] Mathemati- 
cal level is elementary; stresses reasoning and 
intuition. Emphasis on quality of data. Includes 
suggested class projects, exercises, MINITAB 
commands and output. HS 


Statistical Methods, T(16-17: 1). Time Series 
Models for Business and Economic Forecast- 
ing. Philip Hans Franses. Cambridge Univ 
Pr, 1998, x + 280 pp, $69.95; $24.95 (P). 
[ISBN 0-521-58404-3; 0-521-58641-0] Fo- 
cuses on methodology and applications rather 
than theory. Assumes knowledge of introduc- 
tory econometrics. No exercises. HS 


Statistical Methods, P. The Design and Anal- 
ysis of Clinical Experiments. Joseph L. Fleiss. 
Wiley Classics Library. Wiley, 1999, xiv + 
[ISBN 0-471-34991-7] 
Paperback republication of the 1986 original. 


Statistical Methods, P. Cognition and Survey 
Research. Eds: Monroe G. Sirken, et al. Ser. in 
Prob. & Stat. Wiley, 1999, xiv + 395 pp, $89.95. 
[ISBN 0-471-24138-5] From the Preface: “It 
[this book] offers a review of the early work in 
cognition and survey research, an update on the 
initiatives currently underway, and a glimpse 
into the future of interdisciplinary work on sur- 
vey methods.” 


Statistics, P. Multivariate Statistical Inference. 
Ed: Edward J. Dudewicz. Amer. J. of Math. & 
Manag. Sci., V. 18, Nos. 1 & 2. American 
Sciences Pr, 1998, 238 pp, $148 (P). [ISBN 
0-935950-41-9] Volume 4 of the proceedings 
of the Multivariate Statistical Inference 2000 
Conference held in 1995 at the University of 
Hawaii. 

Applications (Economics), P. Advances in De- 
cision Analysis. Eds: Nadine Meskens, Marc 
Roubens. Math. Modelling: Theory & Applic., 
V. 4. Kluwer Academic, 1999, ix + 202 pp, 
$99. [ISBN 0-7923-5563-6] 10 papers from 
the 1997 International Conference on Methods 
and Applications of Multiple Criteria Decision 
Making held in Mons, Belgium. 

Applications (Engineering), T(15: 1), L. 
Wavelets. Jöran Bergh, Fredrik Ekstedt, Martin 
Lindberg. Studentlitteratur, 1999, vii + 210 pp, 
SEK 355 (P). [ISBN 91-44-00938-0] Signal 
processing approach to wavelets. First half is 
devoted to theory, second half to applications. 
Begins with a review of signal processing and 
filter banks, then discusses multiresolution anal- 
ysis and wavelets in several dimensions. PG 


Applications (Fluid Mechanics), P. Annual 
Review of Fluid Mechanics, Volume 31. Eds: 
John L. Lumley, Milton Van Dyke, Helen L. 
Reed. Annual Reviews, 1999, xi + 650 pp, $60. 
[ISBN 0-8243-0731-3] 


Applications (Statistical Mechanics), P. Sta- 
tistical Green’s Functions. V.I. Yukalov. Papers 
in Pure & Appl. Math., V. 110. Queen’s Univ, 
1998, iv + 130 pp, (P). [ISBN 0-88911-824-8] 


Applications (Systems Theory), P. Gener- 
alized Riccati Theory and Robust Control: A 
Popov Function Approach. Vlad Ionescu, Cris- 
tian Oara, Martin Weiss. Wiley, 1999, xxi + 
380 pp, $125. [ISBN 0-471-97147-2] 


Reviewers 


JD: Jill Dietz, St. Olaf; PF: Paul Froeschl, Macalester; PG: 
Philip Gloor, St. Olaf; SN: Sam Northshield, Carleton; KS: 
Karen Saxe, Macalester; HS: Heidi Shierholz, St. Olaf; 
KES: Kay E. Smith, St. Olaf; LAS: Lynn Arthur Steen, 
St. Olaf; TAV: Theodore A. Vessey, St. Olaf. 




THE OHIO STATE UNIVERSITY 


The Department of Mathematics at Ohio State offers several options for graduate 
study: a broad Ph.D. degree program with specialization in nearly all branches of 
contemporary mathematics, pure and applied; an M.S. degree program with an 
elective focus in pure or applied mathematics; and a dual M.S. degree program with 
Computer Science. 


Students are encouraged to begin their studies at Ohio State in June. Summer 
Fellowships providing stipends, in addition to a waiver of all tuition and fees, are 
anticipated. Academic-year financial support is available in the form of Teaching 
Associateships and Fellowships with stipends ranging, approximately, from 
$13,500 to $15,000, in addition to a waiver of all tuition and fees. Summer support 
in the form of TA, RA and Fellowships is also available for almost 90% of 
continuing students. 


Further information and application materials are available from: 


THE OHIO STATE UNIVERSITY 
Department of Mathematics 
231 W. 18th Avenue 
Columbus, Ohio 43210-1174 
(614)292-6274 


e-mail: bonace@math.ohio-state.edu 
web site: http://www.math.ohio-state.edu 
Leads students quickly to the key ideas of combinatorics in a logical and proactive way . . . 


This book teaches the art of enumeration, or counting, by leading the reader 
through a series of carefully chosen problems that are arranged strategically 
to introduce concepts in a logical order and in a provocative way. 


The format is unique in that it combines features of a traditional textbook 
with those of a problem book. It is organized in eight sections, the first four 
of which cover the basic combinatorial entities of strings, combinations, dis- 
tributions and partitions. The last four cover the special counting methods of 
inclusion and exclusion, recurrence relations, generating functions, and the 
method of Polya and Redfield that can be characterized as “counting modulo 
symmetry.” The subject matter is presented through a series of approximately 
250 problems with connecting text where appropriate, and is supplemented 
by approximately 220 additional problems for homework assignments. Many 
applications to probability are included throughout the book. 


While intended primarily for use as a text for a college-level course taken by 
mathematics, computer science and engineering students, the book is suitable as 
well for a general education course at a good liberal arts college, or for self-study. 


Combinatorics: A Problem Oriented Approach 
Daniel Marcus 
Textbook 
Catalog Code: CMB/JR 
156 pp., Paperbound, 1998, ISBN 0-88385-708-1 
List: $28.00 MAA Member: $22.50 
Series: Classroom Resource Materials. Solutions manual available with adoption orders. 


Phone in Your Order Now! 1-800-331-1622 


Monday — Friday 8:30 am — 5:00 pm FAX (301) 206-9789 
or mail to: The Mathematical Association of America, PO Box 91112, Washington, DC 20090-1112 
All prices subject to change. Charges for delivery are $3.00 per order. For optional air delivery outside of the 
continental U. S., please include $6.50 per item. Prepayment required. Order from: American Mathematical 
Society, P. O. Box 5904, Boston, MA 02206-5904, USA. For credit card orders, fax 1-401-455-4046 or call toll 
free 1-800-321-4AMS (4267) in the U. S. and Canada, 1-401-455-4000 worldwide. Or place your order through 


AMERICAN MATHEMATICAL SOCIETY 


Starting Our Careers 


A Collection of Essays and Advice on 
Professional Development from the 
Young Mathematicians’ Network 


Curtis D. Bennett, Bowling Green State 
University, OH, and Annalisa Crannell, Franklin 
& Marshall College, Lancaster, PA, Editors 


If you are the reader we envision for this book, you 
have just passed through the most crucial stage of 
your career—writing and defending your doctoral 
thesis in mathematics—only to discover what lies 
ahead is, yet again, the most crucial stage of your 
career: making the choice about what job to take ... 


—from the Introduction 


This “how-to” book addresses all aspects of a young
mathematician’s early career development: How do
I get good letters of recommendation? How do I
apply for a grant? How do I do research in a small
department that has no one in my field? How do I do
anything meaningful if all I can get is a series of
one-year jobs?


These articles paint a broad portrait of current
professional development issues of interest from the
Young Mathematicians’ Network, from finding jobs
to organizing special sessions. There are chapters
on applying for positions, working in industry and in
academia, starting and publishing research, writing
grant proposals, applying for tenure, and becoming
involved in the academic community. The book offers
timely and sound advice from contributors ranging from
recent doctorates to experienced mathematicians. The
material originally appeared in the electronic pages of
Concerns of Young Mathematicians. The book is devoted
exclusively to the early stages of a mathematical career.


1999; 116 pages; Softcover; ISBN 0-8218-1543-1; List $24; 
Individual member $14; Order code SOCMM910 


Assistantships and Graduate 
Fellowships in the Mathematical 
Sciences, 1999-2000 


Review of the previous annual edition: 


What makes this directory unusual is the additional infor- 
mation provided about the department. The AMS has 
provided for each department the number of tenured 
faculty that have published within the last three years and 
a breakdown of the financial support available to graduate 
students as well as the kind of work required to obtain 
support. From a student's point of view, these additional 
data are vital in the selection process. The AMS has pro- 
vided a valuable aid to students in the mathematical 
sciences. This guide is highly recommended for any
academic institution with an undergraduate mathematics
major.

—American Reference Books Annual


Or place your order through the AMS bookstore at www.ams.org/bookstore/. Residents of Canada, please include 7% GST.


Assistantships and Graduate Fellowships brings together 
a wealth of information about resources available for 
graduate study in mathematical sciences departments in 
the U.S. and Canada. Information on the number of 
faculty, graduate students, and degrees awarded (bache- 
lor’s, master’s, and doctoral) is listed for each department 
when available. Stipend amounts and the number of 
awards available are given, as well as information about 
foreign language requirements. 


Also listed are sources of support for graduate study and 
travel, summer internships, and graduate study in the U.S. 
for foreign nationals. Finally, a list of reference publications 
for fellowship information makes Assistantships and Gradu- 
ate Fellowships a centralized and comprehensive resource. 
1999; approximately 130 pages; Softcover; ISBN 0-8218- 
2011-7; List $20; Individual member $12; Order code 
ASST/99MM910 


Combined Membership List 
1999-2000 


The Combined Membership List (CML) is a comprehensive 
directory of the membership of the American Mathematical 
Society, the American Mathematical Association of Two- 
Year Colleges, the Mathematical Association of America, 
and the Society for Industrial and Applied Mathematics. 


There are two lists of individual members. The first is a 
complete alphabetical list of all members in all four organi- 
zations. For each member, the CML provides his or her 
address, title, department, institution, telephone number (if 
available), and electronic address (if indicated), and also 
indicates membership in the four participating societies. 
The second is a list of individual members according to 
their geographic locations. In addition, the CML lists acad- 
emic, institutional, and corporate members of the four 
participating societies providing addresses and telephone 
numbers of mathematical sciences departments. 

1999; approximately 376 pages; Softcover; ISBN 0-8218- 
1997-6; List $62; Individual member $37; Order code 
CML/1999/2000MM910 


Mathematics into Type 
Updated Edition 


Ellen Swanson, Director of AMS Editorial Services 
(Retired) 


This edition, updated by Arlene O’Sean and Antoinette
Schleyer of the American Mathematical Society, brings
Ms. Swanson’s work up to date, reflecting the more
technical reality of publishing today. While it includes
information for copy editors, proofreaders, and production
staff to do a thorough, traditional copyediting and
proofreading of a manuscript and proof copy, it is increasingly
useful to authors, who have become intricately
involved with the typesetting of their manuscripts.

1999; 102 pages; Softcover; ISBN 0-8218-1961-5; List $24; 
Individual member $14; Order code MIT/2MM910 
DERIVE is the trusted
mathematical assistant relied
upon by students, educators,
engineers, and scientists around 
the world. It does for algebra, 
equations, trigonometry, vectors, 
matrices, and calculus what the 
scientific calculator does for 
numbers — it eliminates the 
drudgery of performing long and 
tedious mathematical 
calculations. You can easily 
solve both symbolic and numeric 
problems and see the results plotted as 
2D or 3D graphs. 


For everyday mathematical work 
DERIVE is a tireless, powerful, and
knowledgeable assistant. For teaching 
or learning mathematics, DERIVE gives 


Soft Warehouse
Honolulu, Hawaii


© 1996 Soft Warehouse, Inc. DERIVE is a registered trademark of Soft Warehouse, 
Inc. Other trademarks are the property of their respective owners. 


Site Licenses and Student Pricing. 
See www.derive.com
better and more quickly than by
using traditional methods. 


System Requirements: 
Windows 95, 3.1x or NT running 
on a computer with 8 megabytes 
of memory. 


Suggested Retail Price: $250. 
Educational pricing available. 


For product information and list of 
dealers, fax, email, write, or call Soft 
Warehouse, Inc. or visit our website at 
http://www.derive.com. 


The Easiest just got Easier. 


Soft Warehouse, Inc. • 3660 Waialae Avenue
Suite 304 • Honolulu, Hawaii, USA 96816-3259
Telephone: (808) 734-5801 after 10:00 a.m. PST
Fax: (808) 735-1105 • Email: swnh@aloha.com
29 Eighteenth Street, N.W.
Washington, DC 20036


